Hi, I’m George. I’d love to share the lessons we learned optimizing data pipelines with AI / embedding calls for our users, which increased pipeline throughput 5x. We did adaptive batching, and the write-up discusses in detail how we did it.
Developers still simply process data row by row; under the hood we queue requests and batch them at the right moments (the batching is effectively columnar), so there's no manual plumbing. Would love your thoughts.
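Roughly, the idea looks like this (a simplified sketch, not the exact code we ship; `embed_batch` stands in for any embedding API that accepts a list of texts): each caller awaits a single row, and whatever queued up while the previous batch was in flight becomes the next batch, so batch size adapts to load.

```python
import asyncio

# Sketch of adaptive batching (illustrative only, not the actual CocoIndex
# internals): callers submit one row at a time; while one batched call is in
# flight, new requests pile up in a queue, and everything that accumulated is
# sent as the next batch.
class AdaptiveBatcher:
    def __init__(self, embed_batch):
        # embed_batch: async callable taking a list of texts, returning a list of vectors.
        self._embed_batch = embed_batch
        self._queue: asyncio.Queue = asyncio.Queue()
        self._worker = None

    async def embed(self, text: str):
        if self._worker is None:
            self._worker = asyncio.create_task(self._run())
        fut = asyncio.get_running_loop().create_future()
        await self._queue.put((text, fut))
        return await fut

    async def _run(self):
        while True:
            # Wait for at least one request, then drain everything queued so far.
            batch = [await self._queue.get()]
            while not self._queue.empty():
                batch.append(self._queue.get_nowait())
            texts = [text for text, _ in batch]
            try:
                vectors = await self._embed_batch(texts)
                for (_, fut), vec in zip(batch, vectors):
                    fut.set_result(vec)
            except Exception as exc:
                for _, fut in batch:
                    fut.set_exception(exc)
```

From the per-row code, it's just `await batcher.embed(row_text)`, which is what keeps the pipeline definition row-oriented while the actual calls go out columnar.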
Hi HN, I’m George — I left Google after 10 years working on infrastructure and am building CocoIndex https://cocoindex.io with my friend Linghua.
CocoIndex is an open-source ETL framework with incremental processing, designed for AI workloads. We cut >90% of compute costs by processing only what's changed, so keeping AI context fresh is effortless.
It's easy to build scalable, production-grade pipelines like Lego in hours. Think of it as n8n with Python blocks, but for large-scale RAG pipelines.
You can build vector indexes, knowledge graphs, and custom logic over any modality in the pipeline with AI. To get started, run `pip install -U cocoindex`.
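The incremental part boils down to something like this (a generic sketch of the idea, not CocoIndex's actual API or storage layer; the fingerprint store here is just a dict): fingerprint each source row and only re-run the expensive transform, such as chunking and embedding, on rows whose content actually changed.

```python
import hashlib

# Generic sketch of incremental processing (not CocoIndex's real implementation):
# keep a content fingerprint per source row and only re-run the expensive
# transform for rows whose content changed since the last run.
def incremental_run(rows, transform, fingerprints):
    """rows: dict of row_id -> content; fingerprints: persisted dict of row_id -> hash."""
    results = {}
    for row_id, content in rows.items():
        digest = hashlib.sha256(content.encode()).hexdigest()
        if fingerprints.get(row_id) == digest:
            continue  # unchanged row: skip recomputation entirely
        results[row_id] = transform(content)
        fingerprints[row_id] = digest
    # Rows that disappeared from the source get retracted downstream.
    for row_id in set(fingerprints) - set(rows):
        del fingerprints[row_id]
        results[row_id] = None
    return results
```

In a real pipeline the fingerprints live in durable storage and retraction propagates to the target index, but the principle is the same: compute is proportional to the change, not to the dataset.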
This article offers a holistic, top-down perspective on Rust's ownership, permissions, and memory-safety model. By rethinking Rust's rules through this mental framework, it demystifies challenging concepts like lifetimes, Send/Sync, and interior mutability, making Rust's safety guarantees easier to understand without memorizing a long list of rules.
Love the idea of writing data pipelines like a spreadsheet. Spreadsheets are an amazing programming model: I can write my formulas without thinking about execution, it calculates the results in the right order, and it automatically takes care of any updates.
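A toy version of that model in Python (purely illustrative, not any particular library's API) shows why it's so pleasant: each value is a formula over other values, evaluation follows the dependencies, and changing an input invalidates only what depends on it.

```python
# Toy spreadsheet-style dataflow: cells are formulas over other cells,
# results are computed lazily in dependency order and invalidated on updates.
class Sheet:
    def __init__(self):
        self._formulas = {}   # name -> (func, dependency names)
        self._values = {}     # cached results and raw inputs

    def define(self, name, func, deps=()):
        self._formulas[name] = (func, deps)

    def set(self, name, value):
        self._values[name] = value
        self._invalidate(name)

    def get(self, name):
        if name not in self._values:
            func, deps = self._formulas[name]
            self._values[name] = func(*(self.get(d) for d in deps))
        return self._values[name]

    def _invalidate(self, name):
        # Drop cached results for anything that transitively depends on `name`.
        for other, (_, deps) in self._formulas.items():
            if name in deps and other in self._values:
                del self._values[other]
                self._invalidate(other)

s = Sheet()
s.set("text", "hello world")
s.define("words", lambda t: t.split(), deps=("text",))
s.define("count", lambda w: len(w), deps=("words",))
print(s.get("count"))   # 2
s.set("text", "a b c")  # only "words" and "count" are recomputed on next read
print(s.get("count"))   # 3
```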