I've ended up writing a fair amount of code to poke at different public bioinformatics APIs: GDC, GTEx, UCSC, ENA, PDB, gnomAD, and a few others. I needed to answer questions like:
- What do these endpoints actually return?
- Are their IDs stable?
- What does the error behavior look like on a bad request?
- How much variance is there across datasets that claim to represent the same thing?
As I kept doing this, it became obvious that these little experiments were slowly turning into a reference collection. So I pulled them together into a single repo:
https://github.com/eosin-platform/cyto-vendor-examples
The goal isn't to build a coherent library or anything polished. It's mainly to have a place where I (and anyone else working with Cyto later) can see:
- minimal working examples for common vendors
- what the real responses look like, not just what their docs say
- how each vendor behaves under simple tests (timeouts, partial content, redirects, missing IDs, etc.)
- enough examples to think about how a unified schema might map onto them
- how different vendors rank in terms of authority
This repo is the groundwork for the unified protocol. If someone else wants to contribute a test for a vendor they know well, or point out something I overlooked, that would be great.
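To give a sense of what one of the simple tests mentioned above might look like, here is a rough Python sketch. It is not code from the repo: `RawResponse`, `ProbeResult`, `classify`, and `probe` are names made up for illustration, the `vendor.example` URLs are placeholders, and the outcome labels are just one way to bucket the behaviors. The fetch step is passed in as a function so the classification logic can be tried without touching a real API.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class RawResponse:
    """Hypothetical, vendor-neutral view of an HTTP response."""
    status: int
    body: bytes
    final_url: str  # URL after any redirects were followed
    headers: Dict[str, str] = field(default_factory=dict)  # lowercase keys assumed


@dataclass
class ProbeResult:
    url: str
    outcome: str  # "ok", "missing_id", "error", "partial_content", "redirected", "timeout"
    status: Optional[int]
    elapsed_s: float


def classify(requested_url: str, resp: RawResponse) -> str:
    """Bucket a response into one coarse outcome label."""
    if resp.status == 404:
        return "missing_id"
    if resp.status >= 400:
        return "error"
    declared = resp.headers.get("content-length")
    short_body = declared is not None and declared.isdigit() and int(declared) > len(resp.body)
    if resp.status == 206 or short_body:
        return "partial_content"
    if resp.final_url != requested_url:
        return "redirected"
    return "ok"


def probe(url: str, fetch: Callable[[str], RawResponse]) -> ProbeResult:
    """Call fetch(url), time it, and classify the result.

    fetch is assumed to raise TimeoutError when the vendor does not answer in time.
    """
    start = time.monotonic()
    try:
        resp = fetch(url)
    except TimeoutError:
        return ProbeResult(url, "timeout", None, time.monotonic() - start)
    return ProbeResult(url, classify(url, resp), resp.status, time.monotonic() - start)


# Usage with a stubbed fetch (no network); a real one would wrap an HTTP client.
def fake_missing(url: str) -> RawResponse:
    return RawResponse(status=404, body=b'{"message": "not found"}', final_url=url)


result = probe("https://vendor.example/files/does-not-exist", fake_missing)
print(result.outcome, result.status)
```

Keeping the fetch function separate means the same classifier can be pointed at different vendors, and the raw responses can be saved alongside the labels for later comparison.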
-Tom