Hacker News

So after transforming multispectral satellite data into a 128-dimensional embedding vector you can play "Where's Wally" to pinpoint blackberry bushes? I hope they tasted good! I'm guessing you can pretty much pinpoint any other kind of thing as well then?

avsm 3 days ago

Yes it's very good fun just exploring the embeddings! It's all wrapped by the geotessera Python library, so with uv and gdal installed just try this for your favourite region to get a false-colour map of the 128-dimensional embeddings:

  # for cambridge
  # https://github.com/ucam-eo/geotessera/blob/main/example/CB.geojson
  curl -OL https://raw.githubusercontent.com/ucam-eo/geotessera/refs/heads/main/example/CB.geojson
  # download the embeddings as geotiffs
  uvx geotessera download --region-file CB.geojson -o cb2
  # do a false colour PCA down to 3 dimensions from 128
  uvx geotessera visualize cb2 cb2.tif
  # project onto webmercator and visualise using leafletjs over openstreetmap
  uvx geotessera webmap cb2.tif --output cb2-map --serve

Because the embeddings are precomputed, the library just has to download the tiles from our server. More at: https://anil.recoil.org/notes/geotessera-python
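For anyone curious what the false-colour step amounts to, here is a minimal sketch of PCA from many dimensions down to three display channels. This is not geotessera's implementation: the function name `pca_false_colour`, the power-iteration approach and the 0..255 stretch are all assumptions made for illustration, written in plain Python so it runs without dependencies.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pca_false_colour(pixels, n_components=3, iters=100, seed=0):
    """Map D-dimensional pixel vectors to n_components channels in 0..255.

    Hypothetical helper for illustration only, not part of geotessera.
    """
    n, d = len(pixels), len(pixels[0])
    # Centre each dimension on its mean.
    mean = [sum(p[j] for p in pixels) / n for j in range(d)]
    centred = [[p[j] - mean[j] for j in range(d)] for p in pixels]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(d)]
           for a in range(d)]

    rng = random.Random(seed)
    components = []
    for _ in range(n_components):
        # Power iteration: repeatedly multiply by the covariance matrix.
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):
            w = [dot(row, v) for row in cov]
            # Deflation: remove what earlier components already explain.
            for c in components:
                proj = dot(w, c)
                w = [wi - proj * ci for wi, ci in zip(w, c)]
            norm = math.sqrt(dot(w, w))
            if norm < 1e-12:
                # No variance left in this direction (degenerate data).
                v = [0.0] * d
                break
            v = [wi / norm for wi in w]
        components.append(v)

    # Project every pixel, then stretch each channel to 0..255.
    scores = [[dot(r, c) for c in components] for r in centred]
    image = [[0] * n_components for _ in range(n)]
    for k in range(n_components):
        column = [s[k] for s in scores]
        lo, hi = min(column), max(column)
        span = hi - lo
        for i in range(n):
            image[i][k] = round(255 * (column[i] - lo) / span) if span > 0 else 0
    return image


# Toy data: ten 4-dimensional "pixels" that vary along a single direction.
toy_pixels = [[float(t), 2.0 * t, 0.0, 0.0] for t in range(10)]
toy_rgb = pca_false_colour(toy_pixels)
```

On real 128-dimensional tiles these pure-Python loops would be slow; NumPy or scikit-learn would be the practical choice.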

Downstream classifiers are really fast to train (seconds for small regions). You can try out a notebook in VSCode to mess around with it graphically using https://github.com/ucam-eo/tessera-interactive-map

The berries were a bit sour, summer is sadly over here!

throwup238 3 days ago

This is all far outside of my wheelhouse but I'm curious if there's any way to use this for rocks and geology? Identifying dikes and veins on cliff sides from satellites would be really cool.

jofer 3 days ago

A major limitation is that most rock types look essentially identical to each other in the visible and NIR spectral ranges. Things separate once you get out to the SWIR bands. Sentinel-2 does have some SWIR bands, and it may work reasonably well with embeddings. But a lot of the signal the embeddings focus on encoding may not be the right features to distinguish rock types. Methods focused specifically on the SWIR range are more likely to work reliably, e.g. simple ratios of SWIR bands may give a cleaner signal than general-purpose embeddings in this case.

Hyperspectral in the SWIR range is what you really want for this, but that's a whole different ball game.
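As a rough illustration of the band-ratio idea, here is a minimal sketch that divides one SWIR band by another pixel by pixel. Sentinel-2's two SWIR bands are B11 (around 1610 nm) and B12 (around 2190 nm); the reflectance values, the helper name `band_ratio`, and the choice of B11/B12 as the pair are assumptions for the example, and which ratio actually separates a given rock type would need checking.

```python
import math


def band_ratio(numerator, denominator):
    """Per-pixel ratio of two equally shaped reflectance grids.

    Pixels where the denominator is not positive become NaN.
    """
    return [
        [n / d if d > 0 else math.nan for n, d in zip(row_n, row_d)]
        for row_n, row_d in zip(numerator, denominator)
    ]


# Made-up reflectance values for a 2x2 patch (not real Sentinel-2 data).
b11 = [[0.30, 0.20], [0.25, 0.10]]
b12 = [[0.15, 0.20], [0.25, 0.00]]
ratio = band_ratio(b11, b12)
```

A threshold on `ratio` would then give a crude mask of candidate pixels to inspect.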

throwup238 3 days ago

> Hyperspectral in the SWIR range is what you really want for this, but that's a whole different ball game.

Are there any hyperspectral surveys with UAVs etc instead of satellites?

jofer 3 days ago

Usually airplanes because the instruments are heavy. But yeah, that's the most common case. Hyperspectral sats are much rarer than aerial hyperspectral.

avsm 2 days ago

An interesting hyperspectral satellite with 30x30m pixels that launched recently and started returning data last year is EnMAP https://www.enmap.org. Hooking that up to TESSERA is on our todo list as soon as we can get our mittens on the satellite data.

sadiq 3 days ago

It might work. TESSERA's embeddings are at a 10 metre resolution, so it might depend on the size of the features you are looking for. If those features have distinct changes in colour or texture over time or they scatter radar in different ways compared with their surroundings then you should be able to discriminate them.

The easiest way to test is to try out the interactive notebook and drop some labels in known areas.
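Outside the notebook, the same idea (a handful of labelled pixels and a small classifier on top of the embeddings) can be sketched in a few lines. This is not what the notebook does internally; the logistic-regression choice, the function names and the toy 2-dimensional "embeddings" are assumptions for illustration. Real TESSERA embeddings have 128 dimensions.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """True if the pixel is classified as the labelled feature."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5


# Toy labelled pixels: 1 = feature of interest, 0 = background.
train_X = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
train_y = [1, 1, 0, 0]
weights, bias = train_logistic(train_X, train_y)
```

Calling `predict(weights, bias, pixel)` on every unlabelled pixel then gives a map of likely matches to check against known areas.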

throwup238 3 days ago

Is there a way to cluster the embeddings spatially or look for patterns isolated to some dimensions? (Again, way out of my wheelhouse)

What I mean is a vein is usually a few meters wide but can be hundreds of meters long so ten meter resolution is probably not very helpful unless the embeddings can encode some sort of pattern that stretches across many cells.

sadiq 3 days ago

It's possible to use embeddings as input to a convolutional network and then train that using labels. We've done that for at least one of the downstream tasks in the TESSERA paper (https://arxiv.org/abs/2506.20380): estimating canopy height.

The downside of that approach is that you need to spend valuable labels on learning the spatial feature extraction during training. To fix that we're working on building some pre-trained spatial feature extractors that you should only need to minimally fine-tune.

tony_cannistra 3 days ago

almost definitely!

Waterluvian 3 days ago

I haven’t done this kind of thing since undergrad, but hyperspectral data is really frickin cool this way. Not only can you use spectral signatures to identify specific things, but also figure out what those things are made out of by unmixing the spectra.

For example, figure out what crop someone’s growing and decide how healthy it is. With sufficient temporal resolution, you can understand when things are planted and how well they’re growing, how weedy or infiltrated they are by pest plants, how long the soil remains wet or if rainwater runs off and leaves the crop dry earlier than desired. Etc.

If you’re a good guy, you’d leverage this data to empower farmers. If you’re an asshole, you’re looking to see who has planted your crop illegally, or who is breaking your insurance fine print, etc.
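To make the unmixing idea above concrete, here is a minimal sketch of linear spectral unmixing with just two endmembers, where a pixel spectrum is modelled as `f * a + (1 - f) * b` and `f` is found by least squares. The spectra are made-up numbers and the helper name `unmix_two` is hypothetical; real unmixing usually involves more endmembers and a constrained solver.

```python
def unmix_two(pixel, endmember_a, endmember_b):
    """Least-squares fraction of endmember_a in a two-endmember linear mix.

    The result is clamped to the physically meaningful range 0..1.
    """
    diff = [a - b for a, b in zip(endmember_a, endmember_b)]
    resid = [p - b for p, b in zip(pixel, endmember_b)]
    denom = sum(x * x for x in diff)
    if denom == 0:
        raise ValueError("endmembers are identical")
    frac = sum(r * x for r, x in zip(resid, diff)) / denom
    return min(1.0, max(0.0, frac))


# Made-up 3-band spectra: a pixel that is 25% crop and 75% bare soil.
crop = [0.10, 0.50, 0.40]
soil = [0.30, 0.30, 0.20]
mixed = [0.25, 0.35, 0.25]
crop_fraction = unmix_two(mixed, crop, soil)
```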

sadiq 3 days ago

Hyperspectral data is really neat though it's worth pointing out that TESSERA is only trained on multispectral (optical + SAR) data.

You are very right on the temporal aspect though, that's what makes the representation so powerful. Crops grow and change colour or scatter patterns in distinct ways.

It's worth pointing out the model and training code are under an Apache2 license and the global embeddings are under a CC-BY-A. We have a Python library that makes working with them pretty easy: https://github.com/ucam-eo/geotessera

CrazyStat 3 days ago

> If you’re a good guy, you’d leverage this data to empower farmers. If you’re an asshole, you’re looking to see who has planted your crop illegally, or who is breaking your insurance fine print, etc.

How does using it to speculate on crop futures rank?

Waterluvian 3 days ago

Every time someone explains the way short selling or speculative markets work, I have an “oh, I get it…” moment and then forget months later.

Same with insurance… socialized risk for our food supply is objectively good, and protecting the insurance mechanism from fraud is good. People can always bastardize these things.

bluGill 2 days ago

It is complex. I was going to write out how it works in a simple way that everyone could understand, but then I realized that even a gross, unrealistic simplification would still be so complex that people would say "yep, I understand that" at every step, and then finish and not understand it. Every step alone makes perfect sense and is simple, but the total quickly gets complex.

Even calling this a speculative market is a gross simplification of the truth.

wbl 3 days ago

It is good to enable people to hedge against bad harvests.

bluGill 2 days ago

There are two sides hedging against bad harvests, the farmer that grows the crop, and the industry (cattle, ethanol, food oils, and others) that buys that crop. The farmer wants to get paid, and the industry wants to get their crop.

sadiq 3 days ago

Yes! TESSERA is very new so we're still exploring how well it works for various things.

We're hoping to try it with a few different things for our next field trip, maybe some that are much harder to find than brambles.

0_____0 3 days ago

I've wondered this about finding hot springs.

avsm 3 days ago

That should be a pretty good use case; if you manually label just a few known hot springs you should be able to find others quite quickly using the TESSERA interactive notebook. The embeddings capture the annual spectral-temporal signature, so a hot spring should be fairly distinctive vs the surroundings.

Video of the notebook in action https://crank.recoil.org/w/mDzPQ8vW7mkLjdmWsW8vpQ and the source https://github.com/ucam-eo/tessera-interactive-map
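The "label a few, find the rest" workflow can also be approximated without a trained classifier. Here is a minimal sketch that ranks unlabelled pixels by cosine similarity to the mean embedding of the labelled ones. This is an assumption about one simple way to do it, not a description of the notebook's method; the pixel ids, 3-dimensional vectors and function names are invented for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity; 0.0 if either vector has zero length."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0 or nb == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def find_similar(embeddings, labelled, top_k=3):
    """Return ids of the top_k unlabelled pixels most like the labelled ones.

    embeddings: {pixel_id: vector}; labelled: set of ids of known examples.
    """
    d = len(next(iter(embeddings.values())))
    # Query vector: mean embedding of the labelled pixels.
    query = [sum(embeddings[i][j] for i in labelled) / len(labelled)
             for j in range(d)]
    scored = [(cosine(vec, query), pid)
              for pid, vec in embeddings.items() if pid not in labelled]
    scored.sort(reverse=True)
    return [pid for _, pid in scored[:top_k]]


# Toy embeddings; "a" stands in for a pixel labelled as a known hot spring.
toy_embeddings = {
    "a": [1.0, 0.0, 0.0],
    "b": [0.9, 0.1, 0.0],
    "c": [0.0, 1.0, 0.0],
    "d": [0.8, 0.2, 0.1],
}
matches = find_similar(toy_embeddings, {"a"})
```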