Computer vision has become commonplace across innumerable industries, but creating and controlling these visual AI models isn’t so easy. Viso is building a low/no-code end-to-end platform that lets companies roll their own computer vision stack, and they just pulled in $9M to scale up.
There are tons of computer vision models and services out there, of course, but a lot sort of fit the description of “model as API.” Say you want to do person recognition and rate whether they’re standing or sitting, so you can tell how busy a train station or restaurant is.
There are fully formed options out there for person and pose recognition, but they may not fit your use case or security model, or they may be too expensive to scale. Building your own is an option, but the expertise required to train and deploy modern CV models is non-trivial: unless you have the time and money to stand up a real team, it may be out of your reach.
That’s the type of situation that Viso wants to remedy, by providing a platform to create an enterprise-grade CV model of your own without dedicating the kind of time and resources that it often takes.
“Early in the adoption cycle, companies resort to buying/renting pre-made computer vision systems. However, they eventually need to bring all computer vision initiatives together (streamlining), and deeply integrate and customize them, and also ‘own’ them because the data is sensitive and the technology of strategic value. This is why companies across those industries are starting to hire AI engineers,” explained Viso’s co-founder and co-CEO, Gaudenz Boesch.
But unlike for many other enterprise-level needs, computer vision lacks a “specialized infrastructure” to efficiently build and deploy it.
“Companies have to build it from scratch, trying to assemble a plethora of disconnected software and hardware platforms (cameras, servers) across the organization,” he continued. This in turn requires expertise across numerous domains that quickly grows too expensive.
Viso’s approach will likely look familiar to anyone who has used no-code tools in other contexts. It amounts to a series of modules, both pre-built and customizable, that let a user select, train, and deploy computer vision models as needed.
Of course, you’ll still need some level of expertise – which object recognition model should it run? Where will training data be kept? How is inference handled? But a handful of engineers can do the work of far more, and all in one place rather than scattered across a dozen tools, APIs, and code notebooks.
Viso says it’s end-to-end, and that doesn’t seem to be an exaggeration. Computer vision requires data to start with, then training processes, then implementation, hosting, compliance work, and so on, and Viso really does seem to be a “soup to nuts” solution that puts all of that in one place.
So if you were making that “busy detector” from earlier, you could conceivably come into it with nothing but a hundred hours of footage and come out the other end a week or two later with a complete product. That would include low-level analysis and storage of the raw data, annotation and labeling, training and testing of the base model, product integration, deployment online or offline, analytics, updates and backups, as well as access and security… all without leaving Viso, and probably without touching the semicolon or bracket keys. (Viso offers various case studies.)
Though there are other computer vision platforms out there, Boesch said none were “built to manage highly complex computer vision applications at scale, and maintain them continuously,” instead being more focused on a handful of tasks from the above list. Viso aims to support as many models and methods, hardware, and use cases as possible, while ensuring the customer owns the end result.
Not being a developer myself, I can’t speak to how difficult or easy different use cases might be, but certainly there is a fundamental attraction (as evidenced by the popularity of other low-code and end-to-end tools) to using fewer and more comprehensive platforms rather than stitching together a series of disconnected ones.
Viso’s investors seem to think so, and the company has raised $9.2 million in seed-stage funding, led by Accel and with various angels participating. Interestingly, the company had been bootstrapped since it was founded in 2018 in Switzerland.
Boesch said that exploding demand caused the company to do the raise, which by AI company standards is quite modest given the products on offer and the existing customers. He said Viso has already been adopted by several large companies, including PricewaterhouseCoopers, DHL, and Orange, and has seen 6x growth in new customers since 2022.