How FPGAs work, and why people will buy them (2013) (embeddedrelated.com)
207 points by snaky on Sept 24, 2018 | 118 comments



> FPGAs are a programmable platform, but one designed by EEs for EEs rather than for programmers.

This is the problem with FPGAs. Their performance and utility are not really disputed. Working with them is so un-ergonomic that it's frankly embarrassing.

The EE world is notoriously closed/proprietary, making it incredibly difficult to explore novelty or customize tooling to suit specific needs. The situation is reminiscent of the pre-gcc era of proprietary C compilers. Sure, synthesis and place-and-route are much harder problems than compiling to machine code, but I would guess that fact amplifies the need.

One can't really expect software's current development comfort without acknowledging its driving force (GNU/the free software movement).


The hard part, AFAICT, is the modeling of the chip for timing analysis. Whereas all of the timing information is explicitly defined for programmers on CPUs, that timing information (and the models of how FPGAs are binned) is extremely proprietary for FPGA vendors, to the point of being treated as a trade secret. Since they use custom cell layouts, I've heard that they consider the timing information core to their IP. If another vendor got it, they could conceivably see which interesting spots are optimized beyond a foundry's standard cells, because the timing guarantees are tighter than standard cells should allow.

Now, I'm personally of the opinion that their competitors have taken their chips into one of those nifty 3D X-ray machines and know the exact layout of the cells anyway, but it's hard to push that idea.


I saw some good reverse engineering of FPGAs, like this work:

https://vtechworks.lib.vt.edu/bitstream/handle/10919/51836/S...

It has a lot of details on things that are clock-related. Since I'm not a hardware guy, I can't say if it has the info needed, whether other work could build on it, or what. So, what do you think?


It's not quite what I'm talking about, and what he has listed there is generally pseudo-publicly documented. Xilinx, for instance, more or less supports programming at the bitstream level rather than the HDL level for stuff like runtime-reconfigurable blocks. So there isn't much there in his thesis that isn't covered in the docs.

What he covers (for the purpose of this discussion) is how the clock blocks (straight-up PLLs, if you've done any embedded development) are connected to the rest of the fabric. That's sort of the digital side, and the hard part is the analog side, i.e. "what are the propagation delays of different configurations of LUTs and fabric, and how can you play games with place and route in order to meet timing". It's an important feedback step in place and route. You can get neat trivial stuff running, but without decent timing analysis it'll be hard to tell if your design will have weird bugs at runtime, or will lack optimizations that would help you meet timing with an equivalent design. I mean, that stuff is almost a black art even with high-quality timing analysis.


But if those (timing) models are incorporated into current proprietary synthesis systems, can't they be reverse engineered with relative ease, anyway?


I too decry the lack of openness with respect to the data file encoding which would allow for more open source tools to be created. I had a hilarious discussion with Xilinx's VP of tools about this[1].

The interesting thing about HDLs and HDL work flows is that they can "look" like software and yet not be software. VHDL and Verilog are the only "languages" I know where you can write something that is both syntactically correct and cannot be inferred into logic by the synthesis part of the tool (the equivalent of the code generator in a compiler). I am not aware of any equivalent to the Turing definition of computability which would prove that for any legal construct in language X there is an implementation of synchronous logic Y that could implement it. Most of the bugs are easy to avoid once you know them though, things like a register value being assigned two different values in the same process block.
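
To make that concrete, here's a minimal Verilog sketch (illustrative module names, not from any real design) of constructs that a simulator happily accepts but that synthesis can't turn into gates, plus the multiple-driver trap:

    // Legal Verilog, simulates fine, but has no hardware equivalent:
    module not_synthesizable (input clk, input d, output reg q);
      always @(posedge clk)
        #5 q <= d;               // explicit time delay, meaningless in gates
      initial begin
        wait (d == 1'b1);        // testbench-style wait
        $display("d went high"); // system task, simulation only
      end
    endmodule

    // The multiple-driver trap: q2 is assigned from two separate always blocks.
    module double_driver (input clk, input a, input b, output reg q2);
      always @(posedge clk) q2 <= a;
      always @(posedge clk) q2 <= b;  // races in simulation, won't map to one register
    endmodule

Exactly which of these a given tool rejects, ignores, or merely warns about varies, which is part of what makes HDL workflows feel different from software.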

That said, I've been playing lately with an "Ultra96"[2] board which is pretty freakin' cool: a quad-core 64-bit ARM CPU and a nice chunk of FPGA fabric to play with as well. I think it can be the basis for a pretty sweet SDR setup.

[1] His assertion was that things had to be hidden so that they could protect the value of the software. When I countered that they could put a 5% tax on every chip they sold and allocate it to software, which would give his team more money than it has today, he argued he would lose sales to cheaper FPGAs. So I asked how the people trying to sell C compilers were holding up, and whether anyone would buy chips if they had to pay extra for their tools while a top chip maker gave its tools away for free. Then he said it needed to be proprietary to keep the quality up, so I went back and asked what C compiler he used, and he admitted they used GCC (when they were compiling for Petalinux etc.). I asked why they didn't use a proprietary C/C++ compiler, and he said those didn't keep up with the standards and gcc generally generated code just as good as the proprietary ones did. He was left with "just because" as his only rationale for not making all of the documentation freely available and covering the cost of the tools through the price of the chips (which I told him would go away as soon as the open source community had caught up with and surpassed Vivado).

[2] http://zedboard.org/product/ultra96


Verilog has behavioural and synthesizable subsets. One is intended for testing, the other for hardware generation. Once you understand this, you can learn which constructs, though syntactically correct, should not be used for hardware generation.

Also, HDLs are not alone in having 'quirks' that experienced engineers need to know about... C/C++ for example. cough undefined behaviour cough

Code coverage and quality checking tools are quite good in the EE world, verification has some powerful tools to almost eliminate hardware bugs - which is especially important for asic design.

On the open sourcing of tools - there are free ones that target real FPGAs, so can you say why these have not surpassed Vivado? In fact, while impressive in their own right, they are very primitive in comparison to Vivado despite being open source for years. The open source Verilog tools also don't fully support all of Verilog 2008/2012.

So I don't buy the argument that open sourcing the synthesis and PnR tools would dramatically affect FPGA sales. Instead, higher-level compilation and abstraction may be the key.


1: Verilog started as a simulation language.

2: if the tools were open source, people would be free to improve on these bags of pain that we have the pleasure of spending thousands of dollars per license on.


So you not only want open source hardware specs but open source existing tools? I don't see Intel open sourcing icc, or other software-development-centric companies open sourcing their IDEs - so why should FPGA vendors? The original point was allowing open source tools to be developed, like gcc, and the response was that it has not happened: some ground-up tools do exist, but they are very primitive.


I'm not sure that the OP was stating that it is a requirement that the FPGA manufacturers open source their tooling, only that the devices be well-documented (without NDA requirements) so that someone could have an opportunity to do so without the imperfections that reverse-engineering a product can entail.

To your example, 'I don't see Intel open-sourcing ICC'; while it would be really nice if they did[0], it's not a requirement in the same manner that it wasn't a requirement for gcc to exist.

I could be wrong here -- I do not develop for the FPGA space, however, I've run into similar problems all over the embedded space. Try developing something on one of ARM's Secure MCU products that lands comfortably in just the "open-source software" category (skipping hardware altogether). It's...tricky. To get details on the design of the security features of these products, you have to execute multiple NDAs. And this is in a security space where openness is considered a security feature. In theory, at least, if you interact with one of these NDA-protected features, publishing the source code might be a violation.

Unfortunately, I suspect that many of these features are as good as the secrets that are kept[1] -- exposure of the documentation would likely yield viable attacks[2].

[0] Not the least of which would be to be able to port some of the optimizations that icc enables for Intel processors but disables for AMD/others to be able to be used on...AMD/others.

[1] To clarify, I have not signed any NDAs with ARM, so this is entirely speculation. I'll be a party to one, shortly, so I won't be talking on the subject assuming -- as I suspect -- that doing so would run afoul of the NDA provisions.

[2] At some point the hardware world will learn from Intel and others that security through obscurity ...isn't. As with Intel, as far as we know, the issues they experienced with their management component existed for years without breach. The vulnerability was shockingly bad, was almost certainly known by adversarial governments and black-hats, who kept it a guarded secret as carefully as Intel kept the details of their management component secret. So they succeeded in keeping attackers in business and customers in the dark ... making everyone feel secure.


Languages can have undocumented behavior, that's fine. But if the HDL compiler compiles code that should have a defined behaviour incorrectly, that's just pure frustration. I swear I've lost so many days just trying to find a way to rewrite pieces of logic so that I could find a variation that Vivado would compile according to the HDL spec.


One of the things to realise about HDLs is that they describe hardware - real hardware is non-deterministic, there are race conditions, clock crossings, metastability etc etc

It's honestly not possible to have a "defined behaviour" in all circumstances - Verilog simulators don't define event order, and if you depend on it, stuff will break (we all fought that battle 20 years ago). More importantly, you kind of hope stuff breaks, to indicate you might not be building designs that work on real hardware
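
For what it's worth, the standard mitigation on the metastability side is the two-flop synchronizer; a minimal sketch (signal names are illustrative):

    // Bring an asynchronous signal safely into the clk domain.
    // The first flop may go metastable; the second gives it a
    // full cycle to resolve before the rest of the design sees it.
    module sync_2ff (
      input  wire clk,
      input  wire async_in,   // from another clock domain or a pin
      output reg  sync_out
    );
      reg meta;
      always @(posedge clk) begin
        meta     <= async_in;
        sync_out <= meta;
      end
    endmodule

It doesn't make the behaviour deterministic; it just makes the probability of a metastable value escaping into the fabric vanishingly small.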


Can you give an example of a Verilog code snippet Vivado actually miscompiles? I've found it incredibly reliable and often use it to double check other tools' results.


> where you can write something that is both syntactically correct and cannot by inferred into logic by the synthesis part of the tool (equivalent of the code generator in a compiler).

A decade ago, programming GPU shaders was similar to that. You could use a higher-level language but there were tight limits on everything, instructions, branches, texture lookups, etc. Here’s a summary for GL, DX was the same because they were hardware limitations of corresponding GPUs: https://stackoverflow.com/a/5601884/126995 Modern GPU chips are way more capable so that’s mostly history now.


Take a look at the LimeSDR Mini device [1] which includes an Intel MAX 10.

[1] https://wiki.myriadrf.org/LimeSDR-Mini_v1.1_hardware_descrip...


Now compare that FPGA with the Xilinx ZU3EG [1] (used on the Ultra96). I have been looking at the USB 3 support on that processor as a means of driving a regular LimeSDR board. Even better would be pulling 4x PCIe[2] lanes off the Zynq and driving an M.2 form factor XTRX or something similar to that. For my purposes I want to ensure the SDR has MIMO capability.

[1] https://www.xilinx.com/support/documentation/selection-guide...

[2] And yes I know I would need a ZU4EG rather than the ZU3EG that is on the board. If I was doing my own board I'd use the 4EG and add some extra memory attached right to the FPGA fabric.


The argument in [1] doesn't make sense to me. Xilinx tools are free, as in beer, and the sale of ICs funds those tools, so they are already implementing your system of a "tax on every chip". Surely the VP of tools at Xilinx knows this.

What problem are you unable to solve with an FPGA because you do not have the source code to Vivado?

I understand why Xilinx doesn't want to release their source. They don't want to support it. Supporting code is a huge overhead and it's not clear, to me anyway, what Xilinx gains from the expenditure.


My argument to the VP was essentially not that they release their source, but rather that they release all of the details you need to create, sign, and then load a bitstream file into their FPGAs. Also to document how the chip is laid out, and how you can floorplan it via hints in the .bit file.

If they did just that, then the source code would "appear" as people wrote back ends for the various open source HDLs that are already in existence.

The challenge that I see is that Xilinx has already had the experience of selling that data to vendors like Synopsys who sell their own synthesis tools and charge major money for them. And like a person who holds an unvested stock option for a stock that goes from price A, above the strike price, to a price B that is below the strike price, they feel as if they "lost" money. Similarly, Xilinx can't see "giving up" the thousands, if not millions, of licensing dollars they are getting from tools vendors just to enable an open source community to start. Because they fundamentally can't see that having a vibrant open source ecosystem benefits all players. And this in spite of the gcc example, which is pretty incontrovertible in my opinion.


They do document how the chip is laid out and exactly what HW resources the chip has. If you want to make your own backend you can. Use the get_property/set_property interface for physical constraints, extract and redo placement on the entire design via TCL, or save to your own intermediate file and work off that. I don't think one needs access to Xilinx's binary formats.


> Xilinx tools are free, as in beer

This is untrue. [1]

> Supporting code is a huge overhead and it's not clear, to me anyway, what Xilinx gains from the expenditure.

In what world is this the case? You host the project on github or the like and let people contribute bug fixes at the cost of filtering pull requests. Letting the community contribute bug fixes is a huge reason companies open source their tools.

[1]: https://www.xilinx.com/products/design-tools/vivado.html#buy


From your link: Vivado HL WebPACK™ Edition: no-cost, device-limited version of the Vivado HL Design Edition

It's not as simple as an upload to github. https://opensource.com/business/16/5/how-transition-product-...


The keywords are "device-limited". Xilinx WebPack won't build bitstreams for some of the larger and faster devices.


It is deprecated: it won't generate bitstreams for any new FPGA.


You're mixing up WebPack (which is a licensing plan for some of Xilinx's tools) and Xilinx ISE (which is one of those specific tools).

Xilinx ISE is indeed deprecated. There are quite a few parts in production which it will still generate bitstreams for, though.

Vivado is the newer replacement. It will not build designs for parts older than 7-series, though, so Xilinx ISE is still required to work with 6-series and older parts, as well as with Xilinx CPLDs.

WebPack licenses are not deprecated. The WebPack program is still active, and will generate limited licenses for both Xilinx ISE and Vivado.


You're right. I wasn't aware that the WebPack licensing program is for Vivado as well. Thanks for the clarification.


Most EE EDA/CAD tools are ancient monstrosities of patchwork, with a user experience reminiscent of using Eclipse 0.01: full of bugs, lacking fast CLI tools. Everything must start a core process that takes many seconds just to boot. It's ridiculous. Software engineers don't know how good they have it in terms of tools.


I started off doing digital design but quickly switched to embedded software after seeing the state of tooling. It's not just the tools themselves either. There are folks who will vehemently defend the way things are and shoot down even the slightest improvement efforts as naïve. With that culture in place, I'm happy just having someone else slap a cortex-mX on a board and programming it with gcc/makefiles/openocd.


Being able to do that for embedded software is also a sign that things are changing. For the longest time embedded devices were only programmable from a vendor supported IDE (looking at you TI and Cypress). At least now the open source community have figured out how to get around the limitations and the tools are starting to flourish with increasing vendor support.


Actually I recently bought a Spartan 7 based FPGA board after taking the nand2tetris course and started playing with the free version of the Vivado suite.

If your complaint is principled - based on the stuff being non-free software - then it holds. If it's about usability, I'm not sure it does. While lengthy, the process of compiling and getting something running on silicon didn't seem any more complicated than Gradle/Maven etc. based Android Studio builds.


> While lengthy, the process of compiling and getting something running on silicon didn't seem any more complicated than Gradle/Maven etc. based Android Studio builds.

I'm not sure many people would share your opinion that Gradle/Maven are uncomplicated.

Personally, I would say they belong to the 20% most complicated build toolchains I have encountered - which would make the FPGA process still not very uncomplicated in comparison.


> Personally, I would say they belong to the 20% most complicated build toolchains I have encountered

I actually agree which is why I used that as a comparison. Over a million developers deal with it every day of their work lives and the FPGA tooling I dealt with did not seem any more complicated than something that's very mainstream.


Maven is just right click + build to build. Or file + import to import a project.

The last time I worked with Vivado, I remember our interns couldn't manage to create a project or use an existing project after a whole week. It doesn't help that there are no tutorials from the vendor or the internet, or that projects can't be stored in source control.


I went on an interview once where part of the process was "here is a development board. Here is a computer and the internet. Make a full adder, connect the inputs to push buttons and the outputs to LEDs. You have all day, by yourself, in a cubicle."

I was told it cut out something like 95% of candidates who looked good on paper. It was an interesting approach.


That's what the interns were trying to do. Get a push button to a LED working, on a Xilinx SoC, with the Vivado IDE. They couldn't figure it out in days.

I don't blame them. I tried and couldn't get it working in a day either. We were lucky we found another guy in the company who worked with that board on a different project. He showed us and he had extensive notes to get the environment working.

Embedded development really sucks.


Not all embedded. Thankfully there is a sort of standard embedded MCU arch showing up via ARM (I'd love RISC-V, but I'll take what I can get). It has support via mainline clang/LLVM and gcc, and it isn't much work to get a more modern setup going via CMake, C++17, and CLion.

But the number of people in embedded who are still going the raw C route for oftentimes unfounded reasons (C++ is slower or has more bloat), prefer one large C file, don't do code-based testing, etc., is staggering.

And for more niche areas (like FPGA) it's extremely behind the times. Not to say C is behind the times - it has its time and place - but 99% of the time they aren't able to properly articulate it.


Ironically enough, a lot of the people I've cut in the interview process seemed like working their way through Vivado, Lattice, etc. wizards was the only development they had ever done. They could get an env setup, but would be completely lost outside of a trivial hello world style project.


Yeah, we had several interview questions specifically to weed those out too. A favorite was, "How would you design a multiplier, without a wizard/Coregen?"

They didn't have to hit everything, but a couple of general multiplier design points - throughput and area? A MAC, or a straight multiplier? Shift-add? Do they know enough to try to hit a certain primitive (DSP48E, fast carry chains)?

The wizard-only designers frequently couldn't even understand the question, and were incapable of discussing tradeoffs of area/time/resources.
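
For a rough idea of the small/slow end of that tradeoff, here's a sequential shift-add multiplier: one product every W cycles out of a single adder (a minimal illustrative sketch, not an interview answer key; the throughput-oriented answers would pipeline it or target a DSP48E):

    // Shift-add multiplier: small area, one WxW product every W cycles.
    module shift_add_mult #(parameter W = 8) (
      input  wire           clk,
      input  wire           start,
      input  wire [W-1:0]   a,
      input  wire [W-1:0]   b,
      output reg  [2*W-1:0] product,
      output reg            done
    );
      reg [W-1:0]       multiplier;
      reg [2*W-1:0]     multiplicand;
      reg [$clog2(W):0] count;

      always @(posedge clk) begin
        done <= 1'b0;
        if (start) begin
          product      <= 0;
          multiplicand <= {{W{1'b0}}, a};
          multiplier   <= b;
          count        <= W;
        end else if (count != 0) begin
          if (multiplier[0])
            product <= product + multiplicand;
          multiplicand <= multiplicand << 1;  // weight the next partial product
          multiplier   <= multiplier >> 1;    // examine the next bit of b
          count        <= count - 1;
          if (count == 1)
            done <= 1'b1;                     // asserted on the edge the final product registers
        end
      end
    endmodule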


I'm currently a CompE student and I was wondering: what exactly do you look for in new hires, in terms of knowledge and experience?


We use an internship program - the best indicators seem to be a lot of undergrad courses in digital design (like 4 or 5 of them), the ability to talk about a project you did that used digital design and what your process was and what you learned, and then some small coding or design problem everyone should have seen. "Build a block that outputs a 1 when it sees the pattern 1100100" - you should know what a finite state machine is, what combinatorial logic is, debouncing switches (not because it's hard, but because it means you've been in a lab!), etc, and if you're from a good program you should have at least heard of some of the more fundamental elements behind synthesis, place and route, and timing; you should know what a UART or SPI is, and have implemented one at least once.
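
As a rough idea of the kind of answer that counts for the "1100100" question, here's the shift-register shortcut (a minimal sketch, assuming the pattern arrives oldest-bit-first on a serial input; an explicit FSM is the more traditional answer):

    // 'match' goes high for one cycle whenever the last 7 bits seen
    // on din equal 1100100 (oldest bit first).
    module detect_1100100 (
      input  wire clk,
      input  wire rst,
      input  wire din,
      output wire match
    );
      reg [6:0] history;
      always @(posedge clk) begin
        if (rst)
          history <= 7'b0;
        else
          history <= {history[5:0], din};   // shift the newest bit in
      end
      assign match = (history == 7'b1100100);
    endmodule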

What I try to do is give candidates a bit of a hand - don't remember UART? What about SPI or I2C? Don't remember either? Ok, how would you design a communication block that looks like this (SPI).

I ask fundamental questions- what's synthesis? What's place and route? Explain timing to me. What underlies the logic minimization at the heart of FPGAs? There are a lot of questions to ask - I try to keep moving if a candidate gets nervous so that I can focus in on something that they can answer. For a lot of young people, especially tech types, if you can get them "on a roll" with something they did or were excited by, their anxiety will drop away and they'll do better. Generally we want to see what you know, but also how you think.

For more senior candidates, I take a similar approach - but I'm trying to identify weaknesses. So instead of a glide path to another problem, I'll use your answer to identify a weakness. Gave a marginal answer about something? Ok, let's see how they handle a hypothetical about pipelining, clock domain crossings, timing violations, block level design, proper reset usage, etc. We have a few questions about math implementations, timing violations, etc that are in our stable as well that have proven to be real destroyers.

There is a fundamental understanding about space, memory, and time that a good candidate will be able to grapple with that a marginal one won't - "Design two versions of this block - a small one, and a higher performance one, in broad strokes. Discuss the difference between them." - Area, throughput, latency.

Also: Show up on time. Do your best. Stay positive. Admit what you don't know. Ask questions when you're stuck. Try to learn something while you're there.

If you can find someone in the industry that is willing to do interview prep with you, do it. When I just got started, somebody sat with me and did mock interviews. It helped me a lot, so I try to pay it forward by doing that for my interns to help them get into the industry.

It's a great time to be in the game. Learn all you can, develop your passion for it, and you'll do great.


I really appreciate the in-depth answer.


Maven is that simple when someone has gone through all the pain of setting things up - but someone has to do that too. It is a battle for all but the very simplest of projects.


Maybe for personal projects, but I find I spend more time dealing with Vivado bugs than I do VHDL bugs. I spent four hours today just getting a simulation to run correctly in 2017.3, upgraded from 2016.1. My co-worker was simultaneously dealing with a bug where deleting a net in ECO was silently deleting other nets, and not necessarily causing an error. EE tools are the worst.


First rule of FPGA development: never ever upgrade IDE in the middle of the project. Even if it’s a move from 2016.x to 2017.y version it can cost 2 months of precious time just dealing with some nonsense migration stuff/broken projects/broken settings.


For basic implementations in ways intended by the vendor, I'm sure the supplied tools are fine. When you want to leave the sandbox and do interesting things, it's hard. I was thinking along the lines of writing your own tools that would require information about timings and chip resources/layouts.

EDIT: your cpu ISA is open, but fpga layout/bit-stream is not--locking you to your vendor's innovation and ideas for how things should be done.


What does leaving the sandbox mean when implementing an FPGA?

I'm not aware of any soft resource one can't completely control in Vivado/Xilinx as every hard resource can be completely described in your own design via the primitive library. That doesn't mean it is always a good idea, but one can do it.


> described in your own design via the primitive library

There's your sandbox

EDIT: If you cannot replace their tools with yours, it is a sandbox.


The elements in the primitive library directly map to the hardware components available on the FPGA. There's no "sandbox" involved.


So when you say primitive components, are you talking about directly manipulating the components in a logic block (https://www.xilinx.com/support/documentation/user_guides/ug3...), or do you just mean the abstractions above that like "blockram", "LUT6", "shift register"? Because you can do a lot more with an FPGA's logic blocks than indicated by those higher level abstractions.


The relevant documents here are the "Virtex-6 Libraries Guide for HDL Designs" and "… for Schematic Designs". (There are similar documents for other part families.) These document all primitives available in the Virtex-6 family, including:

• Individual LUTs, with optional local or dual outputs (LUT1, LUT1_L, LUT1_D, … LUT6, LUT6_L, LUT6_D)

• Flip-flops (FDCE, FDRE, etc)

• The carry chain, which can be instantiated directly (CARRY4), or as its individual elements (MUXCY, XORCY)

• Shift registers implemented in LUTs (SRL16E, SRLC32E)

• Dynamically configurable LUTs (CFGLUT5)

https://www.xilinx.com/support/documentation/sw_manuals/xili...

https://www.xilinx.com/support/documentation/sw_manuals/xili...


Lots of engineers manage to do interesting things with the vendor tools.


Mhm. I am aware.

People did interesting things chipping away with rocks as well.


> didn't seem any more complicated than Gradle/Maven etc. based Android Studio builds

Can you say anything nicer about it than that? I've used Make, ant, and cmake, I've built and packaged multiplatform GUI apps using Eclipse RCP, I've built modern web apps using npm and babel, and building Android apps was the single worst experience out of all of them. Comparing a toolchain to Android sounds more like a takedown than a defense.


Gradle and Maven aren't the poster children for easy development setups - I would instead look at a successful dynamic language like Python, PHP, or the web with JavaScript.

That being said, I’m also not sure the issue is entirely developer ergonomics. Personally I think the fact that the hardware is something you have to go out and buy versus just coming stock in your computer is more of a factor in developer onboarding, and the resulting mindshare.


One exception: the iCE range from Lattice, which has an open source toolchain.

https://en.wikipedia.org/wiki/ICE_(FPGA)


AFAIK that toolchain is thanks to the reverse engineering efforts of an individual and not Lattice itself. It's a step in the right direction, but not really a first-class vendor-supported tool.


That's where GCC was circa 1988.


> The EE world is notoriously closed/proprietary making it incredibly difficult to explore novelty or customize tooling to suit specific needs.

The EE world could use novel tooling to increase the efficiency with which you create RTL. But that can easily be done today if you consider Verilog to be an intermediate representation.

So I think you're talking about the tooling on the steps to go from RTL to bitstream? What kind of novel tooling do you have in mind?

In Quartus and ISE, once you've set up the pin assignments, going from RTL to a bitstream is a matter of a single click.


Xilinx recently created (then revoked) an open-source tool called "RapidWright" that might give you an idea -- it was basically a big Java API for manipulating and controlling Xilinx DCPs/bitstreams, allowing for some interesting tools. They plan on re-releasing it[1], but you can still see the docs here:

  http://www.rapidwright.io
The most interesting part is the tutorials, where they automatically e.g. insert ChipScope ILAs into existing, routed designs, which can only be done with some knowledge of the routed/placed results. (The automated UltraScale+ SLR-crossing example is probably more _interesting_ from a technical POV, but admittedly the ChipScope example was something I wanted recently!) Granted, that's not exposing the level of detail you'd need to write a full place-and-router or anything, but post-route/route-assistance tooling seems like an interesting possibility.

Honestly, though, I'd happily take none of that if an open source tool just meant you could fix some of the bugs in the damn things.

[1] I assume they revoked it in the first place due to some outbreak of batshit insanity in a legal department somewhere -- presumably some of the source code should not have been released, or someone's mind changed, or something.


There's potentially tremendous value in having an open bitstream.

To get the most out of hardware, you need feedback on how many LUTs each part of your design uses, where the critical dependencies are for the achievable clock rates, and so on.

It's difficult to get this information in a usable way in practice today even when you're writing Verilog. Using Verilog as an intermediate language makes the problem significantly worse.

Imagine there was the equivalent of LLVM for FPGA development. Without open bitstreams, that's never going to happen. (Even with open bitstreams, it'd need tremendous effort to get there, but at least there'd be a chance.)


Why do you think this information is unavailable today?

I can get the critical paths, device utilization, power, etc. from every vendor's software suite I'm aware of. And they are all fully scriptable from the TCL interface. Once I have a mostly stable design, I usually run Xilinx/Vivado from the command line. Same with Lattice. The reports the vendors provide are much better than what you would get from the raw bitstream because they have all the symbol information. What you're proposing is akin to decompiling C from object code.

Also keep in mind there is huge variation in architectures between vendors, products, and the various product categories. For example, few things are purely LUT-based anymore. Now we have macrocells, "logic elements," slices, etc., and that's just the soft stuff.

People have been working on implementing FPGA designs in higher-level languages for years. Even LLVM-to-RTL has been tried a few times. I've observed MATLAB-to-RTL is starting to catch on in the DSP/control-system crowd.

https://en.wikipedia.org/wiki/C_to_HDL


LUTs are still the building block of all FPGAs. Logic elements are just a higher level of hierarchy.

For example, according to the Intel Arria 10 handbook, an ALM contains 2 4-input LUTs and 4 3-input LUTs, which can be combined in various ways. (See figure 7 of the A10 handbook.)


> To get the most out of hardware, you need feedback on how many LUTs each part of your design uses, where the critical dependencies are for the achievable clock rates, and so on.

In practice, if you code your Verilog with speed in mind (limited levels of combinational logic, plenty of pipeline stages, one-hot encodings, ...), the synthesis and fitter tools will do an excellent job of extracting high speed. The modern ones will do register cloning, pull registers into RAMs and DSPs, etc.
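
As a trivial illustration of the "plenty of pipeline stages" point, here's the same sum written as one long combinational path versus split across register stages so each path is a single adder (an illustrative sketch, nothing more):

    module sum4_comb (
      input  wire        clk,
      input  wire [31:0] a, b, c, d,
      output reg  [33:0] y
    );
      always @(posedge clk)
        y <= a + b + c + d;        // a chain of adders between registers
    endmodule

    module sum4_pipelined (
      input  wire        clk,
      input  wire [31:0] a, b, c, d,
      output reg  [33:0] y
    );
      reg [32:0] ab, cd;
      always @(posedge clk) begin
        ab <= a + b;               // stage 1: one adder per path
        cd <= c + d;
        y  <= ab + cd;             // stage 2: result lags the inputs by two cycles
      end
    endmodule

The fitter can often do some of this for you via retiming, but writing it explicitly is usually what "coding with speed in mind" ends up meaning.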

In case of timing problems, the timing analyzer will already show you the critical path, the number of LUTs, and how the critical path is routed.

The hardest part of figuring out why a particular path is violating timing is not because of the reporting of the tools, it's because a blob of logic has been flattened and merged into a LUT, and it's hard to identify how a LUT corresponds to the RTL.

Maybe a custom tool can help debugging these kind of things, but I think it's a very narrow case, and IMO not the kind of thing that makes EDA tools unergonomic.

> It's difficult to get this information in a usable way in practice today even when you're writing Verilog. Using Verilog as an intermediate language makes the problem significantly worse.

That was not my point.

My point is: if your goal is to make FPGA design more ergonomic (whatever that means), there's a much higher !/$ in improving the front-end design process than the back-end.


> The hardest part of figuring out why a particular path is violating timing is not because of the reporting of the tools, it's because a blob of logic has been flattened and merged into a LUT, and it's hard to identify how a LUT corresponds to the RTL.

... and we have no realistic hope of figuring out whether maybe the tools can be improved to help with that, because the tools aren't open :)

> if your goal is to make FPGA design more ergonomic (whatever that means), there's a much higher !/$ in improving the front-end design process than the back-end.

And my point was that it's impossible to do Pareto improvements as long as the back-end is a black box. Like, you could come up with a better high-level language for the "front-end", sure. And that new language may be more convenient for writing a design.

But if the way you synthesize this new language is via Verilog, then those nice features you mentioned, like showing the critical path, will de facto be lost -- because the closed tools will show you the critical path in the intermediate Verilog. And if you've ever actually worked with HLS tools, you'll know that this intermediate Verilog is totally opaque unless you spend a very long time understanding the translation tool.

The new "front-end" solution will be better in some respects, but certainly worse in others, and the only way to get improvements across the board is if you have control over the whole "stack".

So I think you're making some good points, but I feel like you're missing the larger point that a truly open ecosystem is desirable precisely because improvement may come in places and from places that you would never have considered.


From what I understand, chips are now laid out by an optimizer instead of by humans. The humans design an adder, but the software picks where to place it and how to arrange the gates.

I took a couple undergrad classes in CE, which makes me dangerous, but can we be that far away from building a macro system to convert small bits of imperative code into FPGA logic?


Naively translating imperative code to run on FPGAs won't do much in the general case. It takes time to get data to and from the FPGA so whatever you're doing on the FPGA has to be time consuming enough to make up for the time spent shuffling data around. FPGAs tend to excel at things that are highly parallel (e.g. routing) or streaming applications (e.g. audio/video codecs), so most applications don't have obvious ways to benefit from an FPGA.


I'd also throw in "things that have precise timing considerations" to the list. Like driving an LED display, etc.


> The EE world is notoriously closed/proprietary

See the IEEE.


Can you elaborate?


IEEE tracks a vast number of open standards and specifications (e.g., Ethernet, WiFi, Bluetooth). I guess the comment was pointing out that there is a degree of openness in EE.


   > IEEE tracks a vast number of open standards
And demands mega-bucks for a PDF of any of those standards


So that organizations with money that benefit from the standards pay for their creation. For everyone else, there's scihub.


The few university libraries I have been to had online subscriptions to all the IEEE publications so any standard or paper could be downloaded by anyone who walked in to the library and sat in front of a computer. Not as convenient as downloading from your home and office but as freely available as any book in the library.


So how do you propose a non-profit like the IEEE make money?


Collecting $ from member orgs, and not from hobbyists who just want to tinker? How else do you propose to raise the next generation of hackers? By charging them $6000 to simply see a spec they have a minute interest in tinkering with?


Standards bodies usually give hobbyists access to documents if asked, but with strings attached, of course. In other words, they need to somehow prove that you will not be using the spec in a commercial setting. This was my experience with the TCG, for example.

As for the IEEE 802 standards, it seems like individuals can get full access to all specification documents that are 6 months and older via the IEEE GET program: https://ieeexplore.ieee.org/browse/standards/get-program/pag.... That sounds pretty fair, no?

In general, standards groups provide access in a similar way to software companies that give out free (or low-cost) access to their tools for open source developers and university students.


I agree the EE community as a whole is not bad. I would say, though, that companies that hire EEs tend to be more secretive than mainly-software companies.


In case there is some interest in learning hardware

    iCEBreaker FPGA
The first open source iCE40 FPGA development board designed for teachers and students.

https://www.crowdsupply.com/1bitsquared/icebreaker-fpga

full disclosure: I will likely get an enthusiastic "Thank you!" for dropping this here.


I've been considering getting an Arduino MKR Vidor 4000, which has an integrated FPGA and is fairly new. Does anyone here have any experience with this as a learning platform for FPGAs? (I already have reasonable experience with Arduino) Any suggestions for alternative boards/tech?


Whenever people ask this question, I'm the broken record with the same reply: start out with the cheapest kit possible, an Altera EP2C5TQ144 development board.

http://land-boards.com/blwiki/index.php?title=Cyclone_II_EP2...

They are old, and you need a slightly older version of the Altera Quartus software suite, but they are dirt cheap.

Go to eBay, search for EP2C5TQ144, and you can find development kits including programming cable starting at $3.50.

The FPGA is tiny, but it has quite a bit of RAM, a bunch of hardware multipliers, and a sufficient amount of logic gates to create your own 32-bit RISC processor if you feel like it. And because it is so tiny, a full synthesis and place & route run only takes a few minutes.

You may be tempted to go the open source route with a Lattice ICE40 FPGA. There are now tons of boards out there with this FPGA, and being fully open source is great. But once you start debugging, you'll be without all the goodies that come with Altera, especially SignalTap (an on-chip synthesized logic analyzer).

Chances are that you will never outgrow the EP2C5 FPGA because you'll lose interest and move on to something else. But if you outgrow it, all you've wasted was one latte at Starbucks.


I'd advise people to start with a Cypress PSoC MCU; they give a taste of FPGA, but have a lot of useful stuff for various projects, and I bought 10 MCUs for $10.


I just googled it and they seem to be MCUs with no programmable logic?

What feature gives them a taste of an FPGA?



That's much more FPGA-like than I expected.


You have a very small number of blocks of analog and digital functions that can be connected together in various ways. Cypress are not open source, but the tools are better than average (FWIW).

FPGAs are a pain because you get a gigantic jump in complexity right away. For example: a system clock. Do you want a single one? Okay, now you have to deal with distributing that clock and dealing with termination and reflections. Do you have more than one? Okay, now you need to deal with synchronizers right out of the chute.

For most hacker types, the Cypress chips are a MUCH better match to what they need to do. Generally these digital tasks break down into something like "Catch a very tight margin digital signal, respond with ACK and wait signal, and signal the core that something needs serviced." (example: communicate with a GPIB device) or "Have a tight feedback loop that receives slow commands" (example: PWM for a motor with an encoder).


The learning curve of FPGAs is steep because it's a completely different way of thinking. And if you're thinking about a particular application, I agree that the Cypress chip might be the best solution in a bunch of cases.

But the original question was which would be a good FPGA learning platform. I don't think the Cypress MCU are the best way to learn about FPGAs.

Similarly, your example of a system clock is only relevant if you're thinking about designing your own boards. That's even more beyond the scope of the question.


Does anyone have experience with cloud-hosted FPGAs (e.g. on AWS), and if so would you recommend this platform to get started with FPGA programming?


You should not use AWS F1 instances if you are just starting out. They will burn a nice-sized hole in your wallet and the only thing you'll get in return is confusion and frustration as a beginner.

The AWS F1 devkit and APIs are largely oriented around making OpenCL-based designs easy and fast to develop (using Xilinx's "SDAccel" framework), but if you're just starting out you almost certainly want to start with classic RTL development using Verilog or whatnot. But "traditional" RTL development is substantially more involved. The documentation and the environment itself are largely written with the assumption that you're a semi-experienced RTL developer, and there's no real way around that part. (For example, even when using OpenCL, you're going to have a hell of a time optimizing anything without understanding real RTL/digital design principles.)

The F1 also has a non-trivial build system, meaning you have to do literally everything on AWS. Not a deal-breaker, but considering the equivalent of a "Hello World" will take you something like 30+ minutes to compile using something like a t2.2xlarge, and anything beyond that will be multi-hour (yes, this isn't an exaggeration) -- you probably don't want to just be spending money on every minor iteration.

I would suggest getting a cheap Lattice FPGA and using Project Icestorm, which is a free FOSS Verilog toolchain, in order to get started. This will cost you closer to $20 USD. https://www.latticesemi.com/icestick and http://www.clifford.at/icestorm/ -- there are allegedly good books on Verilog, but I'm afraid I can't recommend any myself...

After that, when you've got a big enough design where the monstrous VU9P -- the Xilinx chip used in the F1 -- is relevant to your interests, you'll know. :)

Source: I worked on moving/porting our designs to the AWS F1 (traditional RTL based stuff, not HLS/OpenCL -- and it was surprisingly challenging, up-to and including talking with AWS engineers directly to resolve bugs in their datacenters. And I was the software guy!)


This article makes me wonder if adding FPGA units to existing CPUs and GPUs that integrate with existing silicon makes a lot more sense than having a completely separate FPGA.


Integration of cpu and fpga has been done for years.


But not in household CPUs, which I think is what GP meant. CPU speeds don't increase anymore; with an FPGA that could be programmed by software such as photo editors or encryption engines, things could be sped up a lot. (At least for crypto, one reason to stay with particular algorithms is hardware support; we'd be more agile if we could implement that in software without sacrificing much speed.)


Do check out Xilinx's Zynq series or Altera's Cyclone series; I think that may be what you're referring to?


I wonder if FPGAs would make good crypto devices, e.g. for disk encryption? If you implement AES on an FPGA, is it fast enough to keep up with at least a SATA 3 SSD?


Yes, they can do traditional cryptography pretty well (depending on the algorithm, of course). AES-128 in an FPGA can encrypt a full 16-byte block every clock cycle when done right. With a 100 MHz clock that's 12.8 Gb/s, more than enough to keep up with SATA 3. You can achieve that with a last-last-gen FPGA that will cost you a couple bucks out of pocket, and it will use a fraction of the power/thermal footprint of any desktop/mobile processor that doesn't have AES support. Likewise, you can easily scale this up -- 10 Gb/s isn't even worth mentioning because it's trivial, so you can think closer to 40 Gb/s and beyond (which, today, is still not that impressive; I'm just giving you an idea.)

The thing is, if you're going to do ubiquitous encryption for something like your SATA link at scale, with a lot of units being sold -- you're better off just using a dedicated ASIC with a fixed algorithm, and your performance/power profile will skyrocket even further.


The FPGA advantage I was thinking of is that FPGAs are not covered by any crypto export regulations in any country.

As well as the possibility of the crypto "code" (HDL) being open source.


Or maybe, if you're going to do ubiquitous encryption for something like your SATA link at scale, with a lot of units being sold, it would be nice to only have to update the FPGA firmware when an exploit is found in the implementation of your crypto.


But no hardware engineer would think of it that way in such a hypothetical product scenario, if they were designing it. Because:

- If many units are being sold, BOM choices matter. People optimize part choices down to fractions of a penny on individual units when scale is large; ASICs and FPGAs are differences in dollars, it's a completely different order of magnitude. Power usage is similarly important for the same reasons. Cost is king, and nobody will buy/integrate your 20x more expensive SATA adapter when another alternative exists that does the same job, cheaper, faster, with lower power. So what about all that alleged 'security' advantage when nobody uses your chip at all?

- There is no indication cryptographic agility is actually advantageous for any given design, it can only be assessed in the context of a threat. It may in fact be a detriment due to exposing further attack surface (e.g. you now need a secure update mechanism). This is important because the design phase is absolutely critical and takes substantial amount of the overall development/market time -- so you don't introduce extra complexity if you don't have reason to believe you need it. (And it's also why you just tend to buy many components from other vendors, because paying a bill to them is cheaper than paying your engineers to recreate everything while assuming they won't fuck up. I'd guess that very few actual FPGA/RTL engineers actually implement AES cores outside of university, as opposed to just reusing an existing one...)

Ultimately all of this comes down to your design requirements for the product, but flexibility can come with costs and in terms of money it definitely is not free.


Hilariously, the update mechanism likely opens an attack vector.


Yes, they are used extensively in crypto applications for precisely that reason.


I thought that CPUs with AES-NI could already keep up.

EDIT: in fact I just looked up some benchmarks for Ryzen and it can do 3GB/s per core. So that should be enough


It's not about performance. (Perf is necessary, not the goal.)

Doing crypto on a separate chip lets you keep the key away from system RAM and CPU cache, removing any possibility of leaks into other programs.


Just because it can doesn't mean it's tolerable to have one core of their CPU eaten up just working on AES all day. Plus, if it wasn't pinned (taskset) to a CPU, you'd see massive latency hits on disk access even if the CPU could keep up. The kernel still has to schedule the task and load balance.


Yea some FPGAs could do that.


Nice article, but it's from 2013.


And to be fair, since then FPGAs have exploded into use cases beyond ASIC prototyping.

They're even starting to be used for some consumer electronics.


FPGAs have been used in the industry for decades, basically every time you need high throughput and/or very low latencies and you don't have the volume to justify making an ASIC then FPGAs and CPLDs are the way to go.

The article is not really talking about that though, it's more about having FPGAs in mainstream desktop computers. This is still far from a reality, even if Intel seems to be pushing for it.


I mean, I know, I've professionally written HDL for FPGAs.

My point is that back then (a decade or so ago and farther back) it was relegated to low volume, high margin products, explicitly as a replacement for an ASIC. This was due to the cost of the FPGAs.

The point of this article is that by embracing the reprogrammable nature, they'll make their way into places that in fact have the volumes required for an ASIC (which might not be nearly as big as you might think), but choose an FPGA anyway in order to reconfigure out in the field. We are starting to see this, and I'm seeing fairly cheap consumer electronics positions (i.e. for products in the range of a hundred or so dollars) asking for HDL experience more and more in my area.

Desktops aren't the only mass platform out there.


Keep in mind ASIC fabrication and tool prices are way down compared to 10 years ago so "cheaper than an ASIC" is a moving target too.


Maybe the only market segment where FPGAs aren't in wide use yet is desktop computers. The iPhone 7 has an iCE5LP4K inside, AWS provides FPGA-based instances, and Microsoft even has FPGA-powered NICs. https://www.theregister.co.uk/2018/01/08/azure_fpga_nics/


Most desktop and server motherboards embed small CPLDs and FPGAs for low-level stuff, like power sequencing.


This has been something FPGA fans have been pushing since I left college with my BSEE 30 years ago.


I'm interested in seeing some examples of their use in consumer electronics. I think this article would be a lot more compelling if it had a section on "You've already bought one if you have a (hypothetically) Roomba vacuum / Tesla Model S / Thinkpad laptop / Ubiquiti router." I'm pretty sure most of those don't have FPGAs, but would be curious to see a list of things which do. I am by no means a representative consumer, but I know there's an FPGA in my:

- Rigol oscilloscope

- National Instruments DAQ equipment

- Mecco laser cutter

- DJI drone

Any others where I might not have taken it apart and seen the big QFP with "Xilinx" or "Altera" on the circuit board would be interesting!


The iPhone 7 has an iCE5LP4K. Those little CPLDs (which are full FPGAs these days, but fighting that terminology seems to be a losing battle) are in everything. Which is fair; they're fantastic power-sequencing controllers and great for coalescing I2C and SPI buses into something more easily consumed by the application processor. Think an FPGA that constantly polls sensors, wrapping their chip-specific formats in a way that lets the CPLD do the filtering and only wake up the application processor when something interesting happens.


FPGAs are starting to see extensive use in the high end audio market for implementing custom filtering, DSP, and discrete DAC mapping algorithms.

Here's an interview with Rob Watts, a DAC designer who's getting the best measurements in the industry right now with his Chord DACs (specifically the Chord DAVE): http://www.the-ear.net/how-to/rob-watts-chord-mojo-tech


I think they're more common than you think

The Model S has at least one FPGA, as do some of Ubiquiti's products. I don't know about the Thinkpad, but Apple has used cheap Lattice FPGAs for interfacing between different hardware components. Macbook Pros have used them in the past (IIRC for driving the display and interfacing with the battery), as have iPhones.


I was surprised to learn that Roland's JP-08 synthesizer uses an FPGA to simulate an analog synth.

https://youtu.be/zIFLdka9kTM?t=7m34s


Digital audio, DACs in particular. From high-end dCS and Chord down to cheap (but fairly widely known and highly regarded) Singxer USB-I2S converters.


There are FPGAs in both the Vive headset and the wands.


and I still haven't bought an fpga :p lies (jk)



