Vlad Pomogaev

Finding YC Startups to Critique with Vector NLP and a Critique of Atopile

Nov 28th 2024

Intro

Over the past year or so I've been trying to start a "business". While looking for a problem that my business can solve, I came across a quote by Paul Graham in his essay, "How to Get Startup Ideas":

The way to get startup ideas is not to try to think of startup ideas. It's to look for problems, preferably problems you have yourself. The very best startup ideas tend to have three things in common: [A] they're something the founders themselves want, [B] that they themselves can build, [C] and that few others realize are worth doing

My problem is that I feel like I don't have any problems that are solvable by a startup and that meet points A and C. All of the problems I deeply care about are systemic issues: global warming, the housing crisis (I live in Vancouver), rampant capitalism, and so on. I would love to see them solved, but they require far more than my input and are themselves difficult ways to make money, or at the very least a very roundabout way of "starting a business".

Another angle on this is the following insight. I'm a deeply technical person. I've only worked at engineering companies with peers that I would describe as very capable, well educated, confident, hardworking, and generally wanting to solve problems. In other words, I got very lucky and have only worked at dream companies. This means that if, God forbid, there is a "problem" at my workplace, there are a dozen or so engineers just waiting around to fix the issue or design a solution. If we had a problem and we needed a solution, why should we go use some startup's services? That takes time, the solution might not exactly fit what we need, it won't integrate as well, and it's probably going to be expensive and have opaque pricing hidden behind a "contact us" button. Why wouldn't we just fix it ourselves?

So ironically, deeply technical people might not be exposed to those ripe "problems" that Paul is talking about, because the problems in their local network are already solved. That leaves me with one final problem: the fact that my job exists! If I could automate it I would.

Systematic Search for Startup Ideas

In this video, YC's Jared Friedman comments that it's possible to sit down and explicitly come up with startup ideas. While the method isn't ideal, I was so stuck that I still wanted to give it a shot. This led me down a road of drawing mind-maps, having circular conversations with ChatGPT, and critiquing YC startups, which is what I'd like to describe now. Here is the process:

Look at YC's list of companies. Identify the companies that are working in a problem space that you have experience in. Then, critique the companies as an expert in the field. I ask myself the following questions:

  1. what problem are they claiming to solve? what problem are they actually solving? are they the same?
  2. how frequent is this problem?
  3. is the solution good?
  4. is there an even bigger problem that this startup is *not* solving that you can solve?

Bag-of-words with Cosine Similarity

My general approach is to take a list of my technical skills and sort the YC companies based on how close they are to my list of technical skills. Then use that list to find which companies I should be looking at to get inspiration.

Since YC has funded over 5000 startups, taking a look at all of them and identifying if you are expert enough to critique them is a chore. In comes my favourite NLP tool on planet earth, a bag-of-words vector-space model. This tool is so useful that it's my go-to method for text-based search and text-based clustering. The Wiki article is interesting and I recommend you read it.
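To make the idea concrete, here is a minimal sketch of a literal bag-of-words comparison using only the Python standard library. The three example sentences are made up for illustration, and the tokenizer is a deliberately crude assumption (lowercase alphanumeric runs, no stemming or stop-word removal).

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count each word; word order is discarded."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical inputs, not taken from the YC list.
skills = bag_of_words("python embedded firmware pcb design")
pcb_startup = bag_of_words("tools to design pcb layouts with python code")
food_startup = bag_of_words("meal delivery for busy families")

print(cosine_similarity(skills, pcb_startup))   # shares "python", "pcb", "design"
print(cosine_similarity(skills, food_startup))  # shares no words at all
```

A pure word-count model like this only rewards exact word overlap, which is why the steps below swap the counts for vectors produced by a pretrained model.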

  1. "Scrape" the following page by loading it, scrolling down until the whole list has loaded, right-clicking, and saving the body as an HTML file. The output is a truncated DOM: an array of links with divs scattered everywhere. Inside there are some spans that provide details about the company.
  2. Clean up the DOM by converting it into JSON.
  3. Load a sentence-transformers/all-MiniLM-L6-v2 model and produce a vector from the description of every company
  4. Also, produce a vector from a file that lists a collection of skills
  5. Calculate the cosine similarity between every company's description and my own, then order the companies by similarity
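Steps 3 to 5 can be sketched roughly as follows. This is not the code from the repository: the JSON field names (`name`, `description`), the `ExampleFoods` entry, the skills sentence, and the keyword-counting `toy_embed` stand-in are all assumptions made so the sketch runs without downloading a model. With the sentence-transformers package installed, the embedder could instead be `SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2").encode`.

```python
import json
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_companies(
    companies: List[Dict[str, str]],
    skills_text: str,
    embed: Callable[[str], Vector],
) -> List[Tuple[float, str]]:
    """Return (similarity, company name) pairs, most similar first."""
    skills_vec = embed(skills_text)
    scored = [(cosine(embed(c["description"]), skills_vec), c["name"])
              for c in companies]
    return sorted(scored, reverse=True)


# Toy stand-in for a real embedding model: counts a few hand-picked keywords.
KEYWORDS = ("code", "electronics", "automate", "food")


def toy_embed(text: str) -> List[float]:
    lowered = text.lower()
    return [float(lowered.count(keyword)) for keyword in KEYWORDS]


# Assumed shape of the cleaned-up JSON from step 2.
companies = json.loads("""[
  {"name": "DryMerge", "description": "Automate work with plain English"},
  {"name": "Atopile", "description": "We make tools to design electronics circuit boards with code"},
  {"name": "ExampleFoods", "description": "Food delivery for offices"}
]""")

ranked = rank_companies(companies, "I write code to automate electronics testing", toy_embed)
for score, name in ranked:
    print(f"{score:.3f}  {name}")
```

Passing the embedder in as a function keeps the ranking logic independent of which model produces the vectors.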

The GitHub is here. And yes, this is AI generated code, but debugged by a human of course!

The Unreasonable Effectiveness

Bag-of-words vector techniques work so well that I'm astounded every time.

For example, take a look at the description of DryMerge, a no-code LLM solution to writing scripts:

Automate work with plain English

If you just look at the individual words in this sentence, there's no mention of "AI" or "scripts" or "programming" or anything of the sort. Yet this description fit very closely to my predefined skill set, which includes "AI" and "scripts" and "programming" etc. Why? I speculate it's the "automate work" part. It's possible that the vector embeddings know that "AI" is synonymous with "automate work", so it embeds that in a vector component, which then matches my vector closely.

A Review of Atopile

Atopile is at the top of my list of startups that match my skillset. Their description is:

"We make tools to design electronics circuit boards with code"

Let me describe some of the challenges in PCB design that I think Atopile is trying to address.

The biggest problem when designing PCBs is transforming your project requirements into a schematic diagram that fulfils them. The requirements could be what you want your gadget/subsystem to do, what inputs and outputs you have, or the transformations to electrical state you want to perform. You need a good understanding of the requirements to come up with a circuit diagram that satisfies them. This means taking components and drawing logical/electrical connections between them. The difficulty comes from:

Where I think atopile shines is in capturing (some of) the electrical requirements and making sure that those requirements are fulfilled. For example, if one of your requirements is to "have" an STM micro on your board, that's pretty easy to add. If that micro itself has a requirement of being powered, it's implied that there needs to be a connection to a power supply. With ato's code, it seems like it's easy to do a "static" check for this.

That being said, the usefulness of this tool will come down to how deep these nested requirements can be checked, and how visible these checks are to the designer.

For example, let's say you have a power supply powering an MCU. The connection between the power supply and MCU is checked to see if it exists. What about the input connection to the power supply? I hope that's checked too. It seems like the language-based descriptions would make it easy to check for these things, either manually, or via an automated linter.
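As an illustration of what such a nested check could look like, here is a rough sketch in plain Python. It is not ato code and not how Atopile works internally; the `Part` structure, the net names, and the reference designators are all hypothetical. Each part names the net that powers it, and the check walks upstream until it reaches an external source or finds a gap.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional


@dataclass
class Part:
    name: str
    power_in: Optional[str]  # net powering this part; None = external source (e.g. a connector)
    power_out: List[str] = field(default_factory=list)  # nets this part drives


def unpowered_parts(parts: List[Part]) -> List[str]:
    """Names of parts whose supply cannot be traced back to an external source."""
    driver_of: Dict[str, Part] = {net: p for p in parts for net in p.power_out}

    def is_powered(part: Part, seen: FrozenSet[str]) -> bool:
        if part.power_in is None:  # external source, nothing upstream to check
            return True
        if part.name in seen:      # regulators feeding each other in a loop
            return False
        driver = driver_of.get(part.power_in)
        return driver is not None and is_powered(driver, seen | {part.name})

    return [p.name for p in parts if not is_powered(p, frozenset())]


board = [
    Part("J1_connector", None, ["VIN"]),
    Part("U1_regulator", "VIN", ["3V3"]),
    Part("U2_mcu", "3V3"),
    Part("U3_regulator", "VBAT", ["1V8"]),  # nothing on the board drives VBAT
    Part("U4_sensor", "1V8"),
]

print(unpowered_parts(board))
```

The sensor is flagged even though its own supply pin is connected, because the regulator feeding it has no input. That is the "what about the input connection to the power supply?" case.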

Let's say you have an FPGA on your board. The FPGA has several I/O banks with different voltages. The datasheet specifies that depending on the voltages, the supplies need to be ramped up in a certain order. Presumably, this is because there are diodes between the rails of the chip, but that detail is not specified in the datasheet. Would this tool be able to check that this requirement is satisfied? Probably not! That would require a structured representation of this requirement to check against. While it's possible to store and check for these kinds of things, a) it's not standardized, so every part needs to have a unique requirements specification; b) some requirements are difficult to contain in a structured form; and c) there are always exceptions to the rule, minor caveats that cannot be represented.
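For a sense of what a "structured representation" of a sequencing requirement might look like, here is a small sketch. The rail names, delays, and the `SequencingRule` format are invented for illustration and do not come from any real datasheet or from Atopile. It also shows the limitation: someone still has to read the datasheet and write the rules down by hand.

```python
from typing import Dict, List, NamedTuple


class SequencingRule(NamedTuple):
    first: str           # rail that must come up first
    then: str            # rail that must come up afterwards
    min_delay_ms: float  # minimum gap between the two


def sequencing_violations(
    rules: List[SequencingRule], rail_ready_ms: Dict[str, float]
) -> List[str]:
    """Compare the designed power-up times against each hand-written rule."""
    violations = []
    for rule in rules:
        first = rail_ready_ms.get(rule.first)
        then = rail_ready_ms.get(rule.then)
        if first is None or then is None:
            violations.append(f"{rule.first} -> {rule.then}: rail timing unknown")
        elif then - first < rule.min_delay_ms:
            violations.append(
                f"{rule.first} -> {rule.then}: gap is {then - first:.1f} ms, "
                f"needs at least {rule.min_delay_ms:.1f} ms"
            )
    return violations


# Hypothetical rules transcribed by hand from a (made-up) datasheet.
fpga_rules = [
    SequencingRule("VCORE", "VAUX", 0.0),
    SequencingRule("VAUX", "VIO", 1.0),
]

# Hypothetical times at which each rail is in regulation after power-on.
design_timing = {"VCORE": 2.0, "VAUX": 3.0, "VIO": 3.5}

for problem in sequencing_violations(fpga_rules, design_timing):
    print(problem)
```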

What about voltage ripple? Startup currents? Power dissipation?

Consider that some of these datasheets are literally thousands of pages long. You might think that your PCB is "very simple" and doesn't have these types of parts, but unless you have purely passive components on your board, that's probably not the case. The absolute barebones MCUs have datasheets in the hundreds of pages. How much time are you really saving by defining your schematic in a text-based form if you still need to read the entire datasheet, understand it, and check the requirements?

Now, maybe you can use an LLM to understand these requirements directly from the datasheet, which would solve the problem of needing the requirements laid out in a structured form. Would you trust that this kind of check is done correctly by the LLM? Would the LLM know to check certain things if they aren't in the training data?

To summarize the schematic capture benefits of atopile: the benefits are limited to the most basic of checks to your design that are not currently well represented in EDA tools.

The second biggest problem with PCB design is transforming your schematic into a PCB layout that faithfully implements it. What I mean by this is that the PCB should implement the schematic in the ways that matter.

Most digital designs have some sort of high-speed components on them. Similar to my arguments above, this tool does not address checking their requirements. High-speed design? Impedance checks? These are the problems that take the most time from a designer and are the most likely to be screwed up.

Now, the one benefit I see of this tool is that it contains sample layouts for groups of components. I think this is the real winner in this tool. I start every PCB layout by grouping related components and performing a "mini-layout" on said components, while taking into account thermal dissipation and mechanical requirements. Usually these components are already grouped on the schematic page in a hierarchical sheet. Only after these mini-layouts are done do I go and group these modules together.

Most of the time I place related components on a single side of the board to begin with anyway. I move some components to the other side only if I need to.

Atopile seems to have a database of these "modules" already pre-laid out, which could save time.

But... let me be really cynical here. How much of a benefit is this really to a designer or layout engineer?

All but the simplest of my board designs have gone through more than one iteration. Every single board I ever made that ended up in a product has had multiple iterations. That means that after the initial grunt-work of Rev 1.0, the rest of the layout changes are minor and don't require these "mini-groupings".

Likewise, most of the Rev 1.0 boards I worked on have themselves been derivations of previous designs/other boards. We simply took the project files, made a copy, changed the name of the project, ripped up 50% of it, and re-made the design.

The most difficult layout issues have been the aforementioned high-speed traces, which congregate around DDR, memory, LVDS, and high-speed busses, which means they are concentrated around the MCU. There is massive benefit to reusing the layout around the MCU from previous projects because that board has already been validated. You know with certainty that the transformation from circuit to PCB was correct before, so your chances of performing the transformation correctly again go up.

So to me, the benefits of this minor layout tool are diminished the more reuse exists in a PCB design.

Where can Atopile do better?

CI integration is akin to checklists in conventional PCB design. Companies tend to produce checklists from past failures and from failures known to the engineers working at the company. You don't want to be one of those individuals whose name is memorialized in a checklist. Trust me, I would know.

There tends to be a "big" checklist for all designs. This checklist is kept in mind by the designer during schematic/PCB design, and then reviewed at critical milestones. Usually it's stored as some sort of text file, web form, or spreadsheet.

There are several annoyances with this checklist system:

I can think of a lot of different ways of solving these problems. Some of which can be solved via existing tools.

However, I think that Atopile is uniquely positioned to solve this checklisting problem. It's not perfect, but having a dependency graph you can derive from your ato code would help to resolve dependencies. By importing modules in a way akin to Python packages, you can store which parts of a checklist are applicable. Having integrations into KiCad would help identify which checks are invalidated when changes are made (PCB or schematic). Scripts can be used to automate some checks. CI and version control? I propose 'git blame'.
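Here is a rough sketch of the "which checks are invalidated" idea, assuming a dependency graph has already been extracted somehow. The module names, the checklist items, and the dictionary format are all hypothetical; nothing here uses an actual Atopile or KiCad API.

```python
from typing import Dict, List, Set


def invalidated_checks(
    depends_on: Dict[str, List[str]],
    checks: Dict[str, List[str]],
    changed: str,
) -> List[str]:
    """Checklist items attached to `changed` or to anything that depends on it."""
    # Reverse the edges so we can walk from a module to its dependents.
    dependents: Dict[str, Set[str]] = {}
    for module, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(module)

    affected: Set[str] = set()
    stack = [changed]
    while stack:
        module = stack.pop()
        if module in affected:
            continue
        affected.add(module)
        stack.extend(dependents.get(module, ()))

    return sorted(item for m in affected for item in checks.get(m, ()))


# Hypothetical module graph: the board imports an MCU and a USB module,
# and the MCU module imports a 3.3 V regulator module.
depends_on = {"board": ["mcu", "usb"], "mcu": ["ldo_3v3"], "usb": [], "ldo_3v3": []}

# Hypothetical checklist items stored alongside each module.
checks = {
    "ldo_3v3": ["input cap rated for VIN"],
    "mcu": ["decoupling on every VDD pin"],
    "usb": ["90 ohm differential pair"],
    "board": ["mounting holes match enclosure"],
}

for item in invalidated_checks(depends_on, checks, "ldo_3v3"):
    print("re-review:", item)
```

Changing the regulator module flags its own check plus those of everything built on top of it, while the unrelated USB check stays signed off.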