Writing a JSON parser is a good way to teach yourself better programming practices. I attribute my understanding of pointer arithmetic and i/o streams to my own efforts in parsing/generating JSON.
There is an objective benefit to using `constructs-to-c` and that is something you mention in your original comment: you generate a single binary file containing all of your business logic, thus avoiding running a "fully interpreted" app in production.
In order to determine if it gives a performance boost to your particular rules engine, you should test it out and benchmark it with something you'd consider a "real world" example.
> you should test it out and benchmark it with something you'd consider a "real world" example.
I thought that you might have specific experience with "real world" applications, thus my question.
The CLIPS "Advanced Programming Guide" states in section 11.1 that "...compiles all of the constructs [..] into a single executable and reduces the size of the executable image. A run-time program will not run any faster than a program loaded using the load or bload commands"; so as far as I understand it, the code is still interpreted (i.e. not affected by the translation to C).
Good find in the APG! That thing is a wealth of info. It looks like constructs-to-c wouldn't make things faster, but would instead reduce the compiled code size.
If I'm understanding your meaning, you mean "interpreted" in a sense that it translates to C code, and C code is interpreted and translated into machine code. Is that accurate?
It's possible that you and I are referring to different things when we use the word "interpreted." When I think "interpreted" in the context of computer programming, I'm thinking in terms of interpreted-at-runtime programming languages like Ruby, Python, and CLIPS before you use `constructs-to-c`. In my mind, using `constructs-to-c` changes this from an "interpreted" to a "compiled" language in that the expert system you write in CLIPS code is its own language with its own data structures, algorithms, and DSL. When you use `constructs-to-c`, you "compile" your application using the programming language of your expert system, thus it is no longer interpreted at run time.
> you mean "interpreted" in a sense that it translates to C code, and C code is interpreted and translated into machine code
No, I mean that the Lisp variant of CLIPS is a pure interpreter, (presumably) not even threaded, and (certainly) without a JIT. The "constructs-to-c" feature apparently doesn't change that, i.e. even in the "translated to C" version, the CLIPS code is still interpreted.
> I'm thinking in terms of interpreted-at-runtime programming languages like Ruby, Python
Me too.
> and CLIPS before you use `constructs-to-c` ... changes this from an "interpreted" to a "compiled" language
Didn't find evidence for this so far; as far as I understand it's still the same interpreter.
I'm sorry, but I'm still not sure what you mean by interpreter. I hope you don't mind if I press further, but I'm very much enjoying this deep dive into the CLIPS C API with you.
When you compile your code including the C files generated by the constructs-to-c command, you must update the main.c file as well as the makefile that ships with CLIPS. During that update, there are a few things you must do:
1. Add the generated C files to the makefile.
2. Remove the function calls in main.c that cause CLIPS to capture STDIN and STDOUT. That is: you remove one layer of abstraction that sits between your rules and I/O streams.
3. Replace the generic CreateEnvironment function call in main.c with InitCImage_1, a function that calls CreateRuntimeEnvironment.
CreateRuntimeEnvironment differs from CreateEnvironment because you are able to pass in pre-built:
- CLIPSLexeme tables
- CLIPSFloat tables
- CLIPSInteger tables
- CLIPSBitMap tables
Which are your rules engine constructs represented in C.
In addition to these four things, CreateRuntimeEnvironment is passed any functions the user may have defined as UserDefinedFunctions (UDFs) in C. This is a more direct path than having CLIPS first interpret CLIPS rules like `(defrule foo =>)` and then translate them into these C representations.
I haven't found the time yet to check the details in the source code, so I can only speculate from experience with other VMs. From your description, it seems the constructs-to-c version removes the interactive part (REPL), which is apparently not needed in stand-alone applications. But that's unrelated to the question in which format the code/rule-base is kept and executed. The C files could embed either a text or a byte code version of the CLIPS code/rule-base, and the interpreter is implemented in C anyway. But I think I have to check the (generated) source code (the architecture manual doesn't even mention the term "interpreter"). For some years I was thinking about using CLIPS for algorithmic music composition, but it might be too slow; it seems non-trivial to get reliable performance figures.
> For some years I was thinking about using CLIPS for algorithmic music composition
This is a great use case! I'd be interested in seeing what you come up with.
> in which format the code/rule-base is kept
When you use constructs-to-c, the generated C files represent your rules engine constructs as pointers to pointers of CLIPSLexeme, CLIPSFloat, CLIPSInteger, and CLIPSBitMap structs. They do not store the raw CLIPS code fed into the interpreter. You can later use functions like `save-facts` to generate the rules in CLIPS syntax based on these pointers to pointers of structs.
Ok, then it's apparently the intermediate representation (what the parser feeds to the evaluator), which makes sense. The evaluator doesn't care whether it comes from the parser or from constructs-to-c generated code.
> Ok, then it's apparently the intermediate representation (what the parser feeds to the evaluator), which makes sense. The evaluator doesn't care whether it comes from the parser or from constructs-to-c generated code.
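To make that idea concrete, here is a toy sketch in C. It is not CLIPS's real data layout: `Node`, `eval`, `parse`, and `prebuilt` are hypothetical names invented for illustration, and the input is assumed to be well-formed. It only shows that an evaluator walking a tree of structs gives the same answer whether the tree was built by a parser at run time or declared as static data, which is roughly what generated C files would amount to.

```c
#include <stdlib.h>

/* Toy intermediate representation. These are NOT the real CLIPS structs. */
typedef enum { NODE_INT, NODE_ADD, NODE_MUL } NodeKind;

typedef struct Node {
    NodeKind kind;
    long value;            /* used by NODE_INT */
    struct Node *left;     /* used by NODE_ADD and NODE_MUL */
    struct Node *right;
} Node;

/* The evaluator only sees Node pointers; it cannot tell where they came from. */
long eval(const Node *n) {
    switch (n->kind) {
    case NODE_INT: return n->value;
    case NODE_ADD: return eval(n->left) + eval(n->right);
    case NODE_MUL: return eval(n->left) * eval(n->right);
    }
    return 0;
}

/* Path 1, "load" style: parse prefix text such as "(+ 1 (* 2 3))" at run time.
   Only '+' and '*' are supported, and the input is assumed well-formed. */
Node *parse(const char **src) {
    while (**src == ' ') (*src)++;
    Node *n = calloc(1, sizeof *n);
    if (n == NULL) return NULL;
    if (**src == '(') {
        (*src)++;                                   /* skip '(' */
        n->kind = (**src == '+') ? NODE_ADD : NODE_MUL;
        (*src)++;                                   /* skip the operator */
        n->left = parse(src);
        n->right = parse(src);
        while (**src == ' ') (*src)++;
        if (**src == ')') (*src)++;                 /* skip ')' */
    } else {
        char *end;
        n->kind = NODE_INT;
        n->value = strtol(*src, &end, 10);
        *src = end;
    }
    return n;
}

/* Release a tree that was built by parse(). */
void free_tree(Node *n) {
    if (n == NULL) return;
    free_tree(n->left);
    free_tree(n->right);
    free(n);
}

/* Path 2, "generated C" style: the same tree as static data, no parser needed. */
static Node lit1 = { NODE_INT, 1, NULL, NULL };
static Node lit2 = { NODE_INT, 2, NULL, NULL };
static Node lit3 = { NODE_INT, 3, NULL, NULL };
static Node product = { NODE_MUL, 0, &lit2, &lit3 };
static Node prebuilt = { NODE_ADD, 0, &lit1, &product };
```

Calling `eval` on the result of `parse` for `"(+ 1 (* 2 3))"` and on `&prebuilt` walks the same shape of tree. The static version skips parsing, but the evaluation step itself is unchanged, which would be consistent with the APG's remark that a run-time program does not run any faster.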
> I thought that you might have specific experience with "real world" applications, thus my question.
As to this point: I don't at present have "real world" experience writing CLIPS. My experience thus far has been entirely research-driven based on observations I've made in "real world" scenarios.
If you want to read about some other people who have used CLIPS in real world scenarios, here's the most comprehensive HN thread to date on the subject:
Thanks. I was also in the discussion. When CLIPS is used as an expert system shell, then performance comparisons are pretty restricted. But since you seem to use it more for "indexing" and "caching", which is a domain shared by other interpreters used for the web, a performance comparison seems more feasible, especially if you have already implemented such features with other interpreters as well.
> you seem to use it more for "indexing" and "caching", which is a domain shared by other interpreters used for the web
That's the idea! I think CLIPS is a good framework for a generic run-loop (or `while` loop). If you can conceptualize your app as such, you can reap the benefits inherent in the rete algorithm within your application logic.
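As a rough illustration of the "generic run-loop" idea, here is a toy match-act loop in C. All names (`WorkingMemory`, `Rule`, `run`, the fact strings) are made up for this sketch, and the matching is a naive re-scan of every rule on every pass. Rete exists precisely to avoid that re-scan by remembering partial matches, so treat this as the shape of the loop, not as how CLIPS implements it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_FACTS 16

/* Working memory: a small set of string facts (hypothetical toy model). */
typedef struct {
    const char *facts[MAX_FACTS];
    size_t count;
} WorkingMemory;

bool has_fact(const WorkingMemory *wm, const char *fact) {
    for (size_t i = 0; i < wm->count; i++)
        if (strcmp(wm->facts[i], fact) == 0) return true;
    return false;
}

/* Add a fact; returns true only if working memory actually changed. */
bool assert_fact(WorkingMemory *wm, const char *fact) {
    if (has_fact(wm, fact) || wm->count == MAX_FACTS) return false;
    wm->facts[wm->count++] = fact;
    return true;
}

/* A rule in its simplest form: if `when` is present, assert `then`. */
typedef struct {
    const char *when;
    const char *then;
} Rule;

/* The run loop: keep firing rules until a full pass changes nothing.
   Returns the number of rule firings that added a new fact. */
size_t run(WorkingMemory *wm, const Rule *rules, size_t nrules) {
    size_t fired = 0;
    bool changed = true;
    while (changed) {
        changed = false;
        for (size_t i = 0; i < nrules; i++) {
            if (has_fact(wm, rules[i].when) && assert_fact(wm, rules[i].then)) {
                fired++;
                changed = true;
            }
        }
    }
    return fired;
}
```

With the two rules `request-received => cache-checked` and `cache-checked => response-sent`, asserting `request-received` and calling `run` chains both rules; calling `run` again fires nothing because working memory has not changed.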
I've been studying Rete for the past few years, and I've been working exclusively in CLIPS for the past year, a programming language that uses the Rete Algorithm at its core. It's a very clever algorithm, and provides a lot of "nice features" that developers find themselves re-implementing in mature code bases. Things like:
- pattern matching
- custom DSL
- caching
I find it difficult to summarize in an "elevator pitch," and I only started seeing things fall into place after trying to use it in earnest on my own. I highly recommend reading the CLIPS documentation posted on the main CLIPS website. The PDFs are long, but quite complete and well written.
Just a heads up: this is not light reading. Rules-based programming is a confusing departure from traditional programming; there's more "magic" involved, similar to convention-over-configuration in other languages/frameworks. The benefits outweigh the upfront learning cost, though.
If anyone is working directly with CLIPS, let me know! I'm actively working on a low level networking library called CLIPSockets, and would like to work with the language full time some day.
Hey! I was using CLIPS in my network management PhD work, but life happened and I was absorbed by my profession. Maybe one day I'll dig up the half finished material and continue, who knows.
CLIPS is one of those definitely underrated gems, many many thanks for what you are doing!
I just ran into this this week implementing a socket library in CLIPS. I used Berkeley sockets, and before that I had only worked with higher-level languages/frameworks that abstract a lot of these concerns away. I was quite confused when Firefox showed "connection reset by peer." It didn't occur to me that it could be an issue "lower" in the stack. `tcpdump` helped me observe the port, and I saw that the server never sent anything before my application closed the connection.
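For anyone hitting the same thing, here is a minimal sketch of a connection handler using plain POSIX sockets. It is not the CLIPSockets code: `handle_client` and `write_all` are hypothetical names, the response is a hard-coded placeholder, and the request is assumed to fit in a single `read`. The point is only the ordering: consume the request, write a response, and then close. As I understand it, closing a TCP socket that still has unread request bytes queued can make the kernel send a reset rather than a normal close, which a browser may report as "connection reset by peer."

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Write the whole buffer, retrying on short writes. Returns 0 on success. */
static int write_all(int fd, const char *buf, size_t len) {
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Handle one already-accepted connection, then close it.
   Returns 0 if a full response was written, -1 otherwise. */
int handle_client(int fd) {
    char request[4096];

    /* Consume the request first so no unread bytes remain at close time.
       Assumption: the whole request arrives in one read. */
    ssize_t n = read(fd, request, sizeof request);
    if (n < 0) {
        close(fd);
        return -1;
    }

    /* Placeholder response; a real server would build this from the request. */
    const char *response =
        "HTTP/1.1 200 OK\r\n"
        "Content-Length: 2\r\n"
        "Connection: close\r\n"
        "\r\n"
        "ok";
    int rc = write_all(fd, response, strlen(response));

    shutdown(fd, SHUT_WR);   /* signal end of response before closing */
    close(fd);
    return rc;
}
```

A real server would also loop on `read` until the full request has arrived and deal with `EINTR` and `SIGPIPE`; those are left out to keep the sketch short.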
I just made CLIPSockets public, which is a culmination of learning I've taken from trying to make web applications using primarily CLIPS. Thought I'd bump it here:
Holy moly. I had my suspicions Magic would be a good candidate for something like CLIPS when it comes to software implementation. Do you have a source for that info? What an amazing bug.
The source for them using CLIPS is a conversation I had with Arena team lead Ian Adams in a Magic-community Discord.
The source for the bug is a video WotC did that I can't find right now that featured the Arena team talking about developing Kaldheim - the bug came from the card Alrund, God of the Cosmos.
> So first, a quick summary of how the rules engine works. When a game of Magic is in progress on MTG Arena, the program that is tracking the state of the game and enforcing all the rules-correct card interactions is called the Game Rules Engine (GRE). It's one of the two main programs that we work on. It's written in a combination of C++ and a language called CLIPS, which is a variant of LISP.
The only way I've found to back up the claims: try to prove them myself :)
or at least attempt to. I'm fairly convinced CLIPS is an unexplored middle-ground answer to conversations around "does our company really need AI/ML? SQL is enough."
I also see CLIPS as a declarative abstraction around our classic imperative CRUD applications. It exposes abstract concepts within the language that we end up implementing ourselves, like application caches (working memory), pattern matchers (LHS of Rules), and permission systems (in CLIPS, the object-oriented concepts known as "COOL" list User as a separate object inheritance chain from the other objects).
Consider PostgreSQL, a popular database that one could consider a Rules Engine:
1. records stored in tables can be implemented with facts in working memory
2. constraints, foreign keys, views, and other abstractions that come with RDBMSs which we slowly replicate within our application layer over time can be implemented with defrules
3. and you can even add "Rules" (https://www.postgresql.org/docs/current/sql-createrule.html) proper in psql
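To illustrate points 1 and 2, here is a toy sketch in C of a foreign-key style constraint expressed as a rule over facts. The fact types and function names are hypothetical and this is neither CLIPS nor PostgreSQL code. In CLIPS itself I would expect this to be a defrule with a `not` pattern on the left-hand side; the C version just spells out the match by hand.

```c
#include <stdbool.h>
#include <stddef.h>

/* "Rows" modeled as facts in working memory (hypothetical toy types). */
typedef struct { int id; } UserFact;
typedef struct { int id; int user_id; } OrderFact;

static bool user_exists(const UserFact *users, size_t nusers, int id) {
    for (size_t i = 0; i < nusers; i++)
        if (users[i].id == id) return true;
    return false;
}

/* The "rule": an order matches when no user fact has its user_id.
   Writes up to max_out offending order ids into orphan_ids and
   returns the total number of violations found. */
size_t find_orphan_orders(const UserFact *users, size_t nusers,
                          const OrderFact *orders, size_t norders,
                          int *orphan_ids, size_t max_out) {
    size_t found = 0;
    for (size_t i = 0; i < norders; i++) {
        if (!user_exists(users, nusers, orders[i].user_id)) {
            if (found < max_out) orphan_ids[found] = orders[i].id;
            found++;
        }
    }
    return found;
}
```

A database enforces this at write time, while a rules engine would react to the offending fact as soon as it is asserted; the check being expressed is the same.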
From this, we could conclude PostgreSQL is an example of an application you can build with a rules engine. However, the ambition of CLIPS is to be a tool used for creating "Expert Systems." Whenever I bring up that term with other people, I'm met with:
1. blank stares
2. scoffs
regardless of their profession.
However, I argue that ChatGPT and other AI/ML chat bots are highly advanced "jack-of-all-trades" Expert Systems. I also argue that some very successful web applications, such as the Pennie Pennsylvania health insurance application and TurboTax, are "specific" Expert Systems. In all of these systems, you interact with someone who is the "expert" and can carry a "conversation" on the topic you specify.
We already use imperative programming languages that query remote databases specifically written to store data. I think rules engines like CLIPS are "low-hanging fruit" because, at their core, they're "lower level" than an RDBMS in terms of the abstractions provided to the developer, but the implementation of the algorithm that interprets CLIPS code is more in line with AI/ML. That is, neural networks are built from rules inferred from the data they're trained on, while Rete networks are built from rules explicitly defined by the developer. Thus, CLIPS is sort of like having a formal language for interacting with a simplified neural network.
I hope my efforts are reaching developers on teams who have reached the point in their product's life when CLIPS "would have been a good idea to start with."
The difficult part: forgoing the temptation to reason "we don't need to use Foo, our product doesn't need all of that."
If any of the above resonates with you, I encourage you to read the CLIPS documentation and try to build something fun with it. You might be surprised at what you learn.
I’ve worked on systems that manage risk, and we had two pieces:
- A statistical model for probability of default
- A rule-based model implementing a policy, taking into account the probability and other signals not captured in the probability
Not everything could be incorporated into the probability, so this was a nice way of having a hand-crafted decision tree. Also, we could tune it eventually, much easier than training and validating a new statistical model.
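A minimal sketch of that two-piece setup in C, with the statistical model reduced to a single input number. The field names, thresholds, and decision labels are all invented for illustration and are not taken from any real system.

```c
#include <stdbool.h>

typedef enum { APPROVE, MANUAL_REVIEW, DECLINE } Decision;

/* Inputs to the policy (hypothetical fields). */
typedef struct {
    double default_probability;  /* output of the statistical model, 0..1 */
    bool   on_watchlist;         /* signal the model does not capture */
    int    account_age_days;     /* another out-of-model signal */
} Applicant;

/* Hand-crafted decision tree. Tuning the policy means editing these
   thresholds, not retraining and revalidating the statistical model. */
Decision apply_policy(const Applicant *a) {
    if (a->on_watchlist)                 return DECLINE;
    if (a->default_probability >= 0.20)  return DECLINE;
    if (a->default_probability >= 0.05)  return MANUAL_REVIEW;
    if (a->account_age_days < 30)        return MANUAL_REVIEW;
    return APPROVE;
}
```

In a rules engine each `if` would become its own rule, which keeps the policy readable as it grows and lets non-model signals be added without touching the probability model.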
You bring up something awesome about CLIPS: it's easy to tailor it to your needs. Once you have your CLIPS code, you can trim out the parts you don't need using compiler flags. Once you have a compiled `clips` binary with only the constructs you need (defrules and deffacts, maybe?), you can execute it, load your rules/facts into the engine, and then generate C code for your specific rules engine. This compiles into an even smaller/faster binary. You can do all of this from within CLIPS proper. Check out the Advanced Programming Guide section 11: Creating a CLIPS Run-time Program https://www.clipsrules.net/documentation/v641/apg641.pdf