PhD candidate in Beijing. I spend most days trying to make multimodal LLMs do something useful with InSAR data: radar interferograms, displacement time-series, the messy phase signals that drop out of orbiting satellites. It is a strange middle ground between remote-sensing geophysics and modern multimodal modelling, and I think it holds more than the field has noticed.
Before grad school I was a regular full-stack person; you can still find traces of that earlier life in older repos.
- Pretraining tokenizers that respect phase, so language models can read SAR
- Building benchmarks for vision-language models on remote-sensing tasks
- Trying to get satellites to draft their own figure captions
| Project | What it is |
|---|---|
| PhaseTokenizer | Discrete phase-aware tokenizer for SAR imagery, so autoregressive multimodal models can ingest radar tiles without throwing away phase. |
| insar-vlm-bench | A benchmark for vision-language models on four interferogram understanding tasks — coherence, deformation localization, fringe counting, noise attribution. |
| deformation-narrator | Library that turns an InSAR displacement time-series into a short, structured natural-language report. |
Multimodal Large Models · InSAR & SAR Imagery · Discrete Visual Tokenization · Vision-Language Benchmarks · Geospatial Foundation Models
A note on older repos
A lot of what's pinned on this account is from earlier years — game engines, web frameworks, the usual undergrad sprawl. Some of it I'm still proud of, some I'd rewrite from scratch. I leave it up because I think a public history is worth more than a curated one.