
Sure. I have a SaaS, EasyALPR.com, that uses ALPR for parking enforcement; I've been working on it for a few years.

I'm actually near a major release, replacing my previous products with something called Parking Enforcer, which I believe is the best mobile app / vehicle grouping tool on the market. I focus on business parks with 300-2000 parking spaces to patrol. It has been in beta for about 9 months.

I have been working a lot on algorithms related to ALPR data set matching, primarily in Python.

I'm not familiar with Storm / Spark. However, one issue is that license plate reads are not 100% accurate, so you are looking for a fuzzy version of the plate.
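A minimal sketch of what "fuzzy" means here, using stdlib difflib (the plates and scores are illustrative):

    import difflib

    def plate_similarity(a, b):
        """Score two plate reads on a 0..1 scale."""
        return difflib.SequenceMatcher(None, a.upper(), b.upper()).ratio()

    # "7ABC123" misread as "7A8C123" (B confused with 8) still scores high
    print(plate_similarity("7ABC123", "7A8C123"))  # ~0.86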

It's possible for collections of plates to sort of fuzz out as incorrect members become more distant from previous ones, causing new matches to join groups they should not. You can write code to handle this, but the original point was ~"this problem is in analysis, not ALPR", which I agree with.
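One way to resist that drift (a sketch, not necessarily how I do it): compare new reads against a per-position consensus of the whole group rather than its most recent member, so a single bad read barely moves the anchor:

    from collections import Counter

    def consensus_plate(reads):
        """Majority vote per character position across the group's reads."""
        longest = max(len(r) for r in reads)
        padded = [r.ljust(longest) for r in reads]
        return "".join(
            Counter(chars).most_common(1)[0][0] for chars in zip(*padded)
        ).strip()

    print(consensus_plate(["7ABC123", "7A8C123", "7ABC123", "7ABC1Z3"]))
    # -> 7ABC123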

As far as computation goes, a new plate may or may not have a group to join. If it has been seen before, you need to look for the group most likely to be the same car. This can mean iterating over a large data set to find the highest-probability group.
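The naive version of that search looks something like this (illustrative names; a real scorer would also weight time, location, and read confidence, not just string similarity):

    import difflib

    def best_group(new_plate, groups, min_score=0.8):
        """Return (group_id, score) for the likeliest group, or None."""
        best = None
        for group_id, reads in groups.items():
            # Score against every read in the group; keep the strongest hit.
            score = max(
                difflib.SequenceMatcher(None, new_plate, r).ratio()
                for r in reads
            )
            if best is None or score > best[1]:
                best = (group_id, score)
        return best if best and best[1] >= min_score else None

That loop is O(groups x reads), which is exactly why the data set tricks below matter.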

There are some tricks for cutting down the data set under consideration. For example, it is much easier to look only at possible sightings from this past week in this area than at every sighting from everywhere (which, when assembling a national database, may be necessary).
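A sketch of that pruning step, assuming each sighting carries a timestamp and a site id (the field names are made up for illustration; in record-linkage terms this is called "blocking"):

    from datetime import datetime, timedelta

    def candidate_sightings(sightings, site_id, now, window_days=7):
        """Only score against recent sightings from the same site."""
        cutoff = now - timedelta(days=window_days)
        return [
            s for s in sightings
            if s["site_id"] == site_id and s["seen_at"] >= cutoff
        ]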

But in my experience, even at tens of thousands of plates, grouping requires task queues and tricks for quick identification and notification of matches. A watchlist (blacklist) check might be short, but the grouping task over months of data can be long.
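Roughly the split I mean, with a stdlib queue standing in as a toy version of a real task queue (Celery, RQ, etc.):

    import queue

    work = queue.Queue()  # stand-in for a real background task queue

    def handle_new_read(plate, watchlist):
        # Fast path: the watchlist is small, so an exact check is instant.
        if plate in watchlist:
            print(f"ALERT: watchlist hit on {plate}")
        # Slow path: grouping over months of data runs as a background job.
        work.put(("group_sighting", plate))

    handle_new_read("7ABC123", {"7ABC123"})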

Some plates are VERY similar but not the same car! Sometimes the ALPR camera forwards photos of fences or vehicle grills that are not license plates at all; those must be ignored, but not with a threshold so aggressive that good data is thrown out. And when bad reads do make it through to the user, they have to be easy to dispose of.
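The first line of defense is the engine's own confidence score plus a basic plate-shape sanity check; a sketch (both thresholds here are illustrative, and tuning them without discarding good reads is the hard part):

    import re

    # Loose shape check: 4-8 alphanumerics. Fences and grills that OCR
    # into one or two characters, or into long garbage, fail this.
    PLATE_SHAPE = re.compile(r"^[A-Z0-9]{4,8}$")

    def keep_read(plate, confidence, min_confidence=0.6):
        """Drop obvious junk without setting the bar so high that
        legitimate low-confidence reads get thrown away too."""
        return (confidence >= min_confidence
                and bool(PLATE_SHAPE.match(plate.upper())))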

Data cleaning can require additional analysis capabilities, like breaking apart improperly grouped vehicles while ensuring they do not rejoin "bad" groups. There are lots of details to handle, and to an end user it is painfully obvious when "matches" are not correct. They expect magic.
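One sketch of the "don't rejoin" bookkeeping: when a user breaks apart a group, record a cannot-link pair and consult it before any future merge (hypothetical structure; the real thing lives in the database):

    cannot_link = set()

    def split_groups(group_a, group_b):
        """User said these are different cars; remember that."""
        cannot_link.add(frozenset((group_a, group_b)))

    def may_merge(group_a, group_b):
        return frozenset((group_a, group_b)) not in cannot_link

    split_groups("grp-17", "grp-42")
    print(may_merge("grp-17", "grp-42"))  # False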

One other note: I found building an efficient and useful model architecture for these purposes to be challenging. There are more details in how the raw data comes in from ALPR scans that have to be handled before you even reach an individual "Sighting."
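To make that concrete, a stripped-down sketch of the layering I mean (the names are mine for illustration, not a real schema): the raw scan payload is messy and per-camera, and only after cleaning does it become a Sighting you can group on.

    from dataclasses import dataclass
    from datetime import datetime

    @dataclass
    class RawScan:
        """What the camera/engine actually sends; may be junk."""
        plate_text: str
        confidence: float
        camera_id: str
        captured_at: datetime
        image_url: str

    @dataclass
    class Sighting:
        """A cleaned, accepted read; the unit that grouping operates on."""
        plate_text: str
        site_id: str
        seen_at: datetime
        source: RawScan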




Question, from a technical perspective: how much of your fuzzy/unclear-plate problem do you think will be solved by advances in cameras? In the context of a car with 4 to 6 cameras mounted on it, roving a parking lot checking plates. Already the video output from the HDMI port on a GoPro Hero7 Black is considerably better than what a $4000 camera could do just 3.5 years ago, and it costs $349... Or the various setups I have seen with modified 22-megapixel Sony mirrorless cameras used for GIS orthophotos from drone platforms.


Hard to say. FWIW, much of the time the exact plate gets read. But it also matters what the recognition model has been trained on. Some states have weird vanity plates. So it's not purely an optics problem.

If I could, I would have all my customers scan plates with the iPhone XS. But many work with an iPhone 7 or even older.

Also, introducing a lot more collected data to dedupe is a thing as well. One of the first algos I worked on was just recognizing, in short-term new data, that we already had a car, so as not to treat the same car as if it were another car.
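That first algo was essentially short-window dedupe; something in this spirit, though simplified (the real version has to be fuzzy rather than exact):

    from datetime import datetime, timedelta

    def is_duplicate(plate, seen_at, recent,
                     window=timedelta(minutes=10)):
        """True if we saw this exact plate moments ago, e.g. from a
        second camera pass; updates the last-seen time either way."""
        last = recent.get(plate)
        recent[plate] = seen_at
        return last is not None and seen_at - last <= window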


Are there ALPR companies with a workflow where the truly unrecognizable plates go into a review queue for image recognition by a human offshore somewhere, in a low-cost location? I'm thinking of the standard call-centre salary for people with an average level of education in second/third-tier cities in India, Pakistan, Bangladesh, etc.


Not that I'm aware of. But it is certainly possible. There is just a lot of data.


There are some tools that focus on this “fuzzy deduplication” problem:

Senzing https://senzing.com/

Tamr https://www.tamr.com/


Thank you, I will look at these.



