For a number of years while working at Misix, I developed and maintained a recommendation engine for at-auction vehicles. The project started with marketing emails and grew into a component the customer would use on their e-bid website to advertise buyer-specific vehicles. Along the way, the recommender matured from a JS program that filled HTML templates and sent emails into a robust, well-designed set of shell tools written in Rust.
The early days
When this project started, I was on a silly, immature kick of "I'll use Node.js for everything." I feel that hampered progress, and I'll explain in a bit how I learned to overcome it, but I think it's important to offer this perspective. We are not always proud of our work, but when we can recognize that it sucks and move past it, progress is achieved. This section is one example of that.
At Misix, I asked the bossman (Andy) about making a recommendation engine for these cars, and it turned out that he was already talking about the same idea. This fast-tracked the project, and we started building an engine with the aim of sending marketing emails to prospective buyers. Over a couple of months, I developed the programs to execute it, and Jonathan (an economist) tested the accuracy of our recommendations.
At first, because we were sending marketing emails, I designed the program around that. That assumption later bit me, and modifying tons of parameters became difficult. It was written with Node.js, and while it allowed fast prototyping, the execution speed was extremely slow. Especially troublesome were the inefficiencies in design. When we started, we only had a small batch of people to recommend cars to. I used database queries for everything, and that turned into a bottleneck when we expanded the pool of buyers. It was clear the volume of recommendations in a full set was going to take more time than my computer had, so I later rewrote it in Rust as a few command-line components.
The JS recommender program had the following components:
- SOAP downloader (get vehicles from customer)
- Vehicle database
- HTML generator
- Email sender
- Analytics downloader
Due to the inefficiencies, I redesigned our recommender system around the Unix philosophy principles of "do one thing well," "be small," and "be generic." Redesigning some components as shell tools made parallelism natural, and GNU Parallel made automation convenient. CSV/TSV was the intermediate format of choice, and was easy to split up and recombine for different jobs in the batch. The code was written in Rust, which compiles into very fast machine code. In all, it took the recommendation process from a day in JS down to minutes in Rust. The components were far more modular, and additional sections were later built to accommodate stuffing and sending more marketing emails.
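To make the "do one thing well" idea concrete, here is a minimal sketch of what one such filter tool could look like. It is not taken from the original source: the function name `filter_rows`, the `min_score` threshold, and the column layout (score in the last TSV column) are all hypothetical. The point it illustrates is that each row is handled independently, so the input can be split into chunks, processed in parallel, and the outputs concatenated.

```rust
use std::io::{self, BufRead, Write};

/// Reads TSV rows from `input` and writes back, unchanged, only those rows
/// whose last column parses as a number >= `min_score`.
/// Hypothetical layout: the real tools' columns are not described in the post.
pub fn filter_rows<R: BufRead, W: Write>(
    input: R,
    mut output: W,
    min_score: f64,
) -> io::Result<()> {
    for line in input.lines() {
        let line = line?;
        // Take the last tab-separated field and try to read it as a score.
        // Rows whose last field is not numeric (e.g. a header) are dropped.
        let keep = line
            .rsplit('\t')
            .next()
            .and_then(|field| field.trim().parse::<f64>().ok())
            .map_or(false, |score| score >= min_score);
        if keep {
            writeln!(output, "{}", line)?;
        }
    }
    Ok(())
}
```

In a real binary, `main` would simply call something like `filter_rows(io::stdin().lock(), io::stdout().lock(), 0.5)`. Because the function holds no state between rows, a tool built this way can be fed chunks of a larger file (for example through GNU Parallel's `--pipe` mode) and the results recombined afterwards.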
The following is the process used to produce assets:
- A query to the database produces a CSV of historical inventory.
- A query to the database produces a CSV of current inventory.
- The historical inventory is piped into the profiler program, which produces a CSV of buyer profiles.
- Both the CSV of current inventory and the CSV of buyer profiles are piped into the ranker program, which produces a CSV of ranked inventory. This is the deliverable asset.
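The profiler and ranker steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the original code: it assumes each vehicle has already been encoded as a numeric feature vector, that a buyer profile is the mean of the vectors of vehicles the buyer bought, and that ranking uses cosine similarity. The names `build_profiles`, `cosine`, and `rank_inventory` are hypothetical, and CSV parsing is left out to keep the vector math in focus.

```rust
use std::collections::HashMap;

/// Profiler step (assumed method): average the feature vectors of the
/// vehicles each buyer purchased. `history` holds (buyer_id, features) rows.
pub fn build_profiles(history: &[(String, Vec<f64>)]) -> HashMap<String, Vec<f64>> {
    let mut sums: HashMap<String, (Vec<f64>, usize)> = HashMap::new();
    for (buyer, features) in history {
        let entry = sums
            .entry(buyer.clone())
            .or_insert_with(|| (vec![0.0; features.len()], 0));
        for (sum, value) in entry.0.iter_mut().zip(features) {
            *sum += *value;
        }
        entry.1 += 1;
    }
    sums.into_iter()
        .map(|(buyer, (sum, n))| (buyer, sum.into_iter().map(|s| s / n as f64).collect()))
        .collect()
}

/// Cosine similarity of two vectors; returns 0.0 if either has zero length.
pub fn cosine(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Ranker step (assumed method): score every vehicle in the current
/// inventory against one buyer profile and sort best match first.
pub fn rank_inventory(profile: &[f64], inventory: &[(String, Vec<f64>)]) -> Vec<(String, f64)> {
    let mut scored: Vec<(String, f64)> = inventory
        .iter()
        .map(|(id, features)| (id.clone(), cosine(profile, features)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored
}
```

Keeping the profiler and ranker as separate functions mirrors the separate programs in the pipeline: the profiles can be computed once from historical inventory and reused against each new batch of current inventory.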
I didn't stay at Misix long enough to see the end of this. The project was stalled for a long time. From what I heard secondhand, our customer never did use my code in their website, because their Windows IT couldn't figure out how to compile my source and their .NET engineers couldn't read my Rust code.
I have uploaded the source code for the Rust version of the engine, for a couple of reasons. One, it is a good example of my clean and tightly written code. It is also a good example of the design of command-line utilities that do one thing and do it well. It works on any kind of inventory, ranked by any kind of attribute. While it was written while I worked at Misix, it was never used in any commercial product, and the company is now defunct. Beyond that, it is so small that I don't know if it's actually copyrightable; there isn't anything "new" in it. It was purely an exercise in vector algebra. Download it here; the git repo has been stripped, since it may have contained customer-specific information.