
I'm a Data Scientist. For some time, I've been working on a library for feature engineering.

• GitHub: https://github.com/feature-express/feature-express
• Website: https://feature.express

It isn't complete yet, and I wouldn't consider it ready for production use or for handling larger datasets.

Here are some of its characteristics:

• Event-based workflows: Everything is first converted to an event format, ingested into an event store, and processed from there.
• In-memory: Both the event store and the evaluation engine have been built from scratch.
• Written in Rust, but there's a Python package available.
• A DSL (Domain Specific Language) for defining aggregations, similar to SQL.

Why am I developing this? I've always found it challenging to build models based on time. These models can be surprisingly tricky, and there's a high risk of accidentally using future data, which leads to data leakage. FeatureExpress is designed to nearly eliminate such mistakes. Moreover, I believe that representing data as events is an intuitive approach.
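To make the leakage point concrete, here is a minimal sketch in plain Python of an in-memory event store with a point-in-time aggregation. It is not FeatureExpress's API or DSL. The names Event, EventStore, window and sum_amount are hypothetical, as is the choice of a half-open window [obs_time - lookback, obs_time). The idea it illustrates is only that a feature computed at an observation time should see events strictly before that time.

```python
from bisect import bisect_left
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    """Hypothetical event record: who, when, and a numeric payload."""
    entity: str
    ts: datetime
    amount: float


class EventStore:
    """Toy in-memory store: events grouped by entity, sorted by timestamp."""

    def __init__(self):
        self._events = {}  # entity -> list of Event, sorted by ts

    def ingest(self, events):
        for e in events:
            self._events.setdefault(e.entity, []).append(e)
        for evs in self._events.values():
            evs.sort(key=lambda e: e.ts)

    def window(self, entity, obs_time, lookback):
        """Events with obs_time - lookback <= ts < obs_time (nothing from the future)."""
        evs = self._events.get(entity, [])
        times = [e.ts for e in evs]
        lo = bisect_left(times, obs_time - lookback)
        hi = bisect_left(times, obs_time)  # excludes events at or after obs_time
        return evs[lo:hi]


def sum_amount(store, entity, obs_time, lookback):
    """Sum of amounts over the lookback window ending just before obs_time."""
    return sum(e.amount for e in store.window(entity, obs_time, lookback))


store = EventStore()
store.ingest([
    Event("u1", datetime(2023, 1, 1), 10.0),
    Event("u1", datetime(2023, 1, 5), 20.0),
    Event("u1", datetime(2023, 1, 10), 40.0),  # lies after the observation time below
])

# Feature for u1 as of Jan 7 with a 7-day lookback: only Jan 1 and Jan 5 count.
feature = sum_amount(store, "u1", datetime(2023, 1, 7), timedelta(days=7))
print(feature)  # 30.0
```

The Jan 10 event never enters the result because the upper bound of the window is the observation time itself. A naive groupby-and-sum over the whole table would return 70.0 for the same row, which is exactly the kind of leak described above.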

