
author here: hit me up with any questions you've got.



Any chance you would release your code for this? I'd love to run an analysis on some lesser known rappers and play around with some of the filters. Awesome project btw.

EDIT: The reason I ask is that I assume you don't have the time or desire to add every rapper every person asks you for.


the code is pretty straightforward, just chapter one of the NLTK book and a txt file.

nltk.org/book
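
For anyone who wants to try this on their own lyrics file, here is a rough sketch of the kind of count described above. It is not the author's actual code: it swaps NLTK for a plain regex tokenizer from the standard library so it runs without extra installs, and the function name, the tokenization rule, and the sample line are all assumptions made for illustration. The 35,000-word cutoff is the sample size quoted elsewhere in this thread.

```python
import re

WORD_LIMIT = 35_000  # sample size quoted in this thread


def unique_word_count(text, limit=WORD_LIMIT):
    """Count distinct tokens among the first `limit` tokens of `text`.

    Hypothetical helper: tokens are lowercase runs of letters, digits
    and apostrophes, which is an assumed rule, not the author's.
    """
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return len(set(tokens[:limit]))


# Made-up sample line; in practice you would read the text from a txt file.
sample = "Cash rules everything around me, cash rules everything"
print(unique_word_count(sample))
```

With NLTK installed, chapter one of the book gets to the same idea with `len(set(text))` on a tokenized text, so the tokenizer is the main thing that would differ.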


Before making this study, what were your predictions? Would you have expected Wu-Tang and GZA to be near the top? What did you expect the average to be?

It would be very interesting to do something similar for rockers too.


I only have hip hop data, sadly.

I didn't expect Wu-Tang to be at the top since it would go across 11 different vocabs. I felt like the law of averages would basically play out.


An artist who meets the requirements for the data: AZ. He's got 5+ albums, some of which went gold, and it'd be interesting, as I believe he may also be the highest-selling solo artist in the top 10 aside from Ghostface.


Did you remove hooks/choruses from the artist's 35,000 word count?


nope. they are in there. if it's on the rap genius lyrics page, I included it.


Awesome work. Did you build a scraper to grab lyrics from Rap Genius?


What are Jay Z's stats?

[Edit] also Notorious B.I.G. :)


Jay Z's stats are there: he's at 4,506. Pretty middle of the pack.

About Biggie: "35,000 words covers 3-5 studio albums and EPs. I included mixtapes if the artist was just short of the 35,000 words. Quite a few rappers don’t have enough official material to be included (e.g., Biggie, Kendrick Lamar)"

And to respond to your child comment, I'm sure the same problem (not enough material) applied to Big L.


Also Big L



