I'd have to know more specifics to be able to comment beyond a certain level.
In general terms, yes, FPGA work can and does usually take longer than the equivalent work in the software domain. It doesn't have to be that way though.
For me it starts with language choices. I suppose that if you work in VHDL all the time you probably rock. I have an intense dislike for VHDL. I don't see a reason to type twice as much to do the same thing. Fifteen years ago VHDL had advantages with such constructs as "generate"; this is no longer the case. I realize that this can easily turn into an argument of a religious nature, so we'll have to leave it at that.
One approach that I have used with great success with complex modules is to write them in software first and then port to the FPGA. Going between C and Verilog is very natural.
The key is to write C code keeping in mind that you are describing hardware all along. Don't do anything that you would not be able to easily replicate on the FPGA. You are, effectively, authoring a simulation of what you might implement in the FPGA. The beauty of this approach is that you get the advantage of immediate execution and visualization in software. Debug initial structures and assumptions this way to save tons of time.
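To make that concrete, here is a minimal sketch of the style I mean, with invented names: model each register bank as a struct and each clock edge as a function that computes next state purely from current state and inputs. The eventual port to a Verilog always block is then nearly mechanical.

    /* Minimal sketch: a C model of a clocked accumulator, written so it
     * maps almost line-for-line onto a Verilog always @(posedge clk)
     * block. All names here (acc_t, acc_tick) are invented. */
    #include <stdint.h>
    #include <stdio.h>

    typedef struct {
        uint16_t sum;    /* becomes a 16-bit register */
        uint8_t  valid;  /* becomes a 1-bit register  */
    } acc_t;

    /* One call == one rising clock edge. Next state is computed purely
     * from current state and inputs, mimicking nonblocking assignment. */
    static acc_t acc_tick(acc_t cur, uint8_t in, int in_valid, int clear)
    {
        acc_t nxt = cur;
        if (clear) {
            nxt.sum = 0;
            nxt.valid = 0;
        } else if (in_valid) {
            nxt.sum = (uint16_t)(cur.sum + in); /* wraps, as hardware would */
            nxt.valid = 1;
        }
        return nxt;
    }

    int main(void)
    {
        acc_t r = {0, 0};
        const uint8_t samples[] = {10, 20, 30, 40};
        for (unsigned i = 0; i < 4; i++) {
            r = acc_tick(r, samples[i], 1, 0);
            printf("cycle %u: sum=%u valid=%u\n", i, r.sum, r.valid);
        }
        return 0;
    }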
Maybe the best way to put it is that I try not to use the FPGA HDL coding stage to experiment and create but rather to simply enter the implementation. Then my goal is to go through as few Modelsim simulation passes as possible to verify operation.
If you've done non-trivial FPGA work you have probably experienced the agony of waiting an hour and a half for a design to compile and another N hours for it to simulate before discovering problems. The write-compile-simulate-evaluate-modify-repeat loop in FPGA work takes orders of magnitude longer than with software. I've had projects where you can only reasonably make one to half-a-dozen code changes per 18-hour day. That's the way it goes.
This is why I've resorted to extensive software-based validation before HDL coding. I've done this with, for example, challenging custom high-performance DDR memory controllers where there was a need to fiddle with a number of parameters and be able to visualize such conditions as FIFO fill/drain levels, etc. A nice GUI on top of the simulation made a huge difference. The final implementation took far less time to code in HDL and worked as required from the very start.
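As a stripped-down illustration (not the actual controller; the depth and rates below are invented numbers), even a simple cycle-stepped fill-level counter flags overruns and underruns long before any HDL exists. A real run would log the fill level every cycle and plot it in the GUI:

    /* Cycle-stepped FIFO fill-level model. FIFO_DEPTH and the write/read
     * rates are made up for illustration; the point is to expose where
     * the FIFO overruns or starves under a given set of parameters. */
    #include <stdio.h>

    #define FIFO_DEPTH 512

    int main(void)
    {
        int fill = 0, overruns = 0, underruns = 0;

        for (int cycle = 0; cycle < 10000; cycle++) {
            /* Reader side: wants one word every 2 cycles. */
            if (cycle % 2 == 0) {
                if (fill > 0) fill -= 1;
                else          underruns++;  /* data not ready in time */
            }
            /* Writer side: a burst of 4 words every 7 cycles. */
            if (cycle % 7 == 0) {
                if (fill + 4 <= FIFO_DEPTH) fill += 4;
                else                        overruns++;  /* FIFO full */
            }
        }
        printf("final fill=%d overruns=%d underruns=%d\n",
               fill, overruns, underruns);
        return 0;
    }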
Another general comment. When it comes to image processing in FPGAs you don't really pay a penalty for modularizing your code to a relatively fine-grained degree. This is because module interfaces don't necessarily create any overhead (the best example of this being interconnect wires). In that sense FPGAs are vastly different from software, where function or class+method interfaces generally come at a price.
Modularization can produce benefits during synthesis and placement. If you can pre-place portions of your design and do your floor planning in advance you can save tons of time. Incremental compilation has been around for a while. Still, nothing beats getting into the chip and locking down structures when it makes sense.
To circle back to the recurring theme of "FPGA for the masses" that pops up every so often: I maintain that FPGAs are, fundamentally, still about electrical engineering and not about software development. These, at certain levels, become vastly different disciplines. Once FPGA compilers become 100 to 1,000 times faster and FPGAs come with 100 to 1,000 times more resources for the money, the two worlds will probably blur into one very quickly for most applications.
Thanks for your insights, there is a lot of value for me in your post.
> I have an intense dislike for VHDL.
I have yet to meet an engineer who likes it!
I hate it with a passion, but it lets me write circuits in the way I want.
Luckily, Emacs VHDL mode makes me type less.
> If you've done non-trivial FPGA work you have probably experienced the agony of waiting an hour and a half for a design to compile and another N hours for it to simulate before discovering problems.
My simulations never took hours.
I use GHDL (an open-source VHDL simulator) to simulate my code, which is much slower than running Modelsim in a virtual machine.
So I guess that you are working on much larger problems than I do.
I have tried using a high-level language before writing my circuits in VHDL.
But the results were not very good, apart from the fact that I learned a lot more about the actual algorithm/circuit.
Either I coded at too high a level, which would be impossible in an FPGA (e.g., accessing a true dual-port block RAM at 3 different addresses in a clock cycle; see the sketch below), or I ended up simulating a lot of hardware just to make sure that it would work.
But the point is, no matter which approach I tried, it was painful, so I ended up choosing the workflow that is less painful.
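To make the block-RAM point concrete: a true dual-port RAM gives you exactly two accesses per clock, so a C model with an explicit per-cycle port budget (a rough sketch with invented names, below, not anyone's actual design) fails loudly where ordinary software code would silently allow a third access:

    /* Dual-port block RAM model with a per-cycle port budget. */
    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>

    #define RAM_WORDS 1024

    typedef struct {
        uint8_t mem[RAM_WORDS];
        int ports_used;  /* accesses so far in the current cycle */
    } dpram_t;

    static void dpram_new_cycle(dpram_t *r) { r->ports_used = 0; }

    static uint8_t dpram_read(dpram_t *r, unsigned addr)
    {
        assert(r->ports_used < 2 && "only two ports per cycle");
        r->ports_used++;
        return r->mem[addr % RAM_WORDS];
    }

    int main(void)
    {
        dpram_t ram = {{0}, 0};
        dpram_new_cycle(&ram);
        uint8_t a = dpram_read(&ram, 1);
        uint8_t b = dpram_read(&ram, 2);
        printf("two reads in one cycle: %u %u\n", a, b);
        /* dpram_read(&ram, 3);  <- a third access this cycle asserts */
        return 0;
    }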
> I'd have to know more specifics to be able to comment beyond a certain level.
I am developing a marker detection system that runs at 100fps, with 640x480 8-bit grayscale images.
First I am doing CCL (connected-component labeling) to find anything in the image that could be a marker.
At the same time, some features are accumulated for each detected component (potential marker).
Then the features are used to find which component is a real marker and what's its ID.
And finally, the markers have some spatial information that allows me to find out the position and orientation of the camera.
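In rough C terms (a simplified sketch rather than my actual VHDL; the labeling itself and the label-merge logic are omitted, and the label is assumed given), the accumulation step looks something like this:

    /* Per-component feature accumulation: each labeled pixel is folded
     * into running statistics in raster order, which maps well to
     * streaming hardware. */
    #include <stdint.h>
    #include <stdio.h>

    #define MAX_LABELS 256

    typedef struct {
        uint32_t count;          /* pixel count            */
        uint32_t sum_x, sum_y;   /* centroid = sum / count */
        uint16_t min_x, max_x;   /* bounding box           */
        uint16_t min_y, max_y;
    } feat_t;

    static feat_t feats[MAX_LABELS];

    /* Called once per labeled pixel, in raster order. */
    static void accumulate(uint8_t label, uint16_t x, uint16_t y)
    {
        feat_t *f = &feats[label];
        if (f->count == 0) {
            f->min_x = f->max_x = x;
            f->min_y = f->max_y = y;
        } else {
            if (x < f->min_x) f->min_x = x;
            if (x > f->max_x) f->max_x = x;
            if (y < f->min_y) f->min_y = y;
            if (y > f->max_y) f->max_y = y;
        }
        f->count++;
        f->sum_x += x;
        f->sum_y += y;
    }

    int main(void)
    {
        /* Toy 4x4 label image: component 1 is a 2x2 square. */
        static const uint8_t labels[4][4] = {
            {0,0,0,0}, {0,1,1,0}, {0,1,1,0}, {0,0,0,0},
        };
        for (uint16_t y = 0; y < 4; y++)
            for (uint16_t x = 0; x < 4; x++)
                if (labels[y][x])
                    accumulate(labels[y][x], x, y);
        printf("component 1: %u px, bbox (%u,%u)-(%u,%u)\n",
               feats[1].count, feats[1].min_x, feats[1].min_y,
               feats[1].max_x, feats[1].max_y);
        return 0;
    }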
Even though the FPGA that I use is the largest of all Cyclone II FPGAs, with 70k LEs, I have to juggle registers and block RAM: the chip is too small to store all the data in registers, and using up too many registers substantially increases the time to place and route the design.
> I maintain that FPGAs are, fundamentally, still about electrical engineering and not about software development. These, at certain levels, become vastly different disciplines. Once FPGA compilers become 100 to 1,000 times faster and FPGAs come with 100 to 1,000 times more resources for the money, the two worlds will probably blur into one very quickly for most applications.
I agree, and I would add that the compilers need to be smarter about parallelizing the code.
So while they can perform better than the alternatives, FPGAs are still a pain to develop for.
Even if the compilers get faster and FPGAs get bigger, writing code for FPGAs will still feel more like writing assembly than like something easily accessible "for the masses".
But I would be happy if the compilers become just 10x faster!
> I hate it with a passion, but it lets me write circuits in the way I want.
Can you explain what you are doing? I am wondering if you might be making your work more difficult by not taking advantage of inference. Are you doing logic-element-level hardware description? In other words, are you wiring the circuits by hand, if you will, by describing everything in VHDL?
I've done that of course, but I don't think it's necessary unless you really have to squeeze a lot out of a design. Where it works well is in doing your own hand-placement and hand-routing through switch boxes, etc. to get a super-tight design that runs like hell. I've done that mostly with adders and multipliers in the context of filter structures.
My guess is that you have set up several delay lines in order to process a kernel of N x M pixels at a time?
It's been a while, but I recall doing a fairly complex shallow-diagonal edge detector that had to look at 16 x 16 pixel blocks in order to do its job. This ended up taking the form of using internal storage in a large FPGA to build a 16-line FIFO with output taps every line. Now you could read a full 16-line vertical chunk-o-pixels into the shallow-edge processor and let it do its thing.
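In the C-model style I described above, the structure is roughly this (a sketch with invented sizes, cut down to a 4-line kernel instead of 16; each line buffer maps to a block RAM in the FPGA):

    /* Line-FIFO-with-taps sketch: buffer the last K-1 video lines so
     * each new pixel yields a full K-pixel vertical slice for the
     * kernel processor. */
    #include <stdint.h>
    #include <stdio.h>

    #define LINE_W 640
    #define K      4      /* kernel height; the edge detector used 16 */

    static uint8_t lines[K - 1][LINE_W];  /* last K-1 lines, oldest first */

    /* Feed one pixel in raster order; 'column' receives the K vertically
     * adjacent pixels ending at the current line. */
    static void push_pixel(uint16_t x, uint8_t pix, uint8_t column[K])
    {
        for (int k = 0; k < K - 1; k++)
            column[k] = lines[k][x];      /* taps: rows y-K+1 .. y-1 */
        column[K - 1] = pix;              /* current row y           */

        for (int k = 0; k < K - 2; k++)   /* age the buffers by one line */
            lines[k][x] = lines[k + 1][x];
        lines[K - 2][x] = pix;
    }

    int main(void)
    {
        uint8_t col[K];
        /* Feed three lines whose pixel values equal the line number. */
        for (uint8_t y = 0; y < 3; y++)
            for (uint16_t x = 0; x < LINE_W; x++)
                push_pixel(x, y, col);
        printf("column after 3 lines: %u %u %u %u\n",
               col[0], col[1], col[2], col[3]);
        return 0;
    }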
The fact that you are working on a 70k LE Cyclone imposes certain limits, not the least of which is internal memory availability. I haven't used a Cyclone in a long time; I'd have to look and see what resources you might have. That could very well be the source of much of your pain. Don't know.