Calvin French-Owen (Co-Founder of Segment) on learning to program, his reading list, and his work on COVID data visualization.
Developer Den is a series of interviews with notable developers in our community to learn more about their journey into engineering. We sat down with Calvin French-Owen, co-founder of Segment, a market-leading customer data platform.
To start, tell me about when you first got interested in computers.
I've always been playing around with computers. When I was nine or ten, my dad was in the software industry. He worked for HP in their calculator division and at a few different ed-tech companies. He brought home a computer, and as a ten-year-old, I was excited about all the games that I could play.
I spent a lot of time playing old-school games like the Dr. Brain series and Math Blaster. My first serious foray into programming happened when I had a TI-83 calculator in junior high. Every student had to have one, and I found out that it had a little BASIC interpreter on it. While it was obviously good at different graphing functions and math equations, the programming language itself was both fairly limited and had all the documentation built-in. For a “goto” function (yes, goto was one of the main control flows, don’t judge), you could see what the syntax was and what it should do. So I built a bunch of calculator games, adventures and animations for my friends and myself.
Now that you've built a business, now that you've written many programs — is there a moment that stands out to you as, "this is where I got really good as a programmer?"
To be honest, I think programming mostly comes down to reps. It's almost like lifting weights — you build the muscle over time. It didn't quite take for me the first couple of times — it wasn't like I intuitively understood: "Hey, this is programming." Over time you get a little bit better at it, you learn new tricks, and you really understand how a computer works.
That said, there were a couple of classes that I took in undergrad that made a huge impression on me. There was one about computers and distributed systems (MIT 6.033), which really explained the way the internet works. There was one on computer architecture (MIT 6.004), which I felt like gave me a good sense for what the machine is actually doing. Both of those were a great foundation.
Then, the thing that made me feel like I got a lot better at writing code was just doing it a crazy amount.
Every year in January, there's this annual competition at MIT called Battlecode. You write a player for this game that's kinda like StarCraft. Your program has to control what all of these little robots do to try to defeat the other players. I competed in that and was spending 15 hours a day building pathfinding algorithms and goal‑setting algorithms, like whether to shoot at the opponent or retreat or how to allocate your resources. I would go to sleep thinking about it… and wake up the next morning with a new idea. That borderline obsession made me a lot better.
Then the first year and a half of our startup journey was this insane learning firehose. For the first time I was expected to spin up a database on AWS, and it's 2011! We had no idea how to do that or how it works. And it's like, "Yeah, you have to set up these EC2 instances, and network them, and set up the right disks, etc." We had to figure out how to actually build the service for people. That was a huge learning experience for all four of us.
Now that Segment is out of your hands, how do you keep learning? What have you been reading recently?
I've been reading a bunch! In November when I stepped away, I tried to explicitly expand my aperture of what stuff I was learning. I read a decent number of design and data visualization books. I've been reading different parts of [Edward] Tufte and [John] Tukey. I read the Richard Hamming book [The Art of Doing Science and Engineering], which I liked a lot. I read this really interesting book on economics called Radical Markets, which has some very out-there ideas for how economics might work in the future. I think a bunch of those are really interesting thought experiments.
One of my personal goals has also been to actually step up my learning culture and habits a little bit. I've started documenting everything in Roam, the note-taking app. The thing I like about it is that it matches the way that I normally think. Each day you get a new daily set of notes and it's up to the tool to connect all these different concepts for you. I don't know if I'm using it right, but I like what I'm doing so far.
The other thing I've started doing a little bit is spaced repetition. It definitely works, which is bizarre. You look at this flashcard that you saw one month ago — or three months ago, or two days ago, depending on how fresh it is — and you remember it! But you're not quite sure how you remember it.
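Editor's sketch: the scheduling idea behind spaced repetition fits in a few lines. This is only an illustration, assuming a simple rule that doubles the gap after a successful recall and resets it to one day after a miss. The names `Card` and `review` are hypothetical, and real flashcard apps use more elaborate algorithms than this.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Card:
    prompt: str
    interval_days: int = 1          # gap until the next review
    due: date = date(2021, 1, 1)    # illustrative start date (assumption)


def review(card: Card, today: date, remembered: bool) -> Card:
    """Reschedule a card: double the gap on success, reset it on a miss."""
    interval = card.interval_days * 2 if remembered else 1
    return Card(card.prompt, interval, today + timedelta(days=interval))
```

Each successful review pushes the card further out, which is why a card can resurface after two days, then a month, then three months.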
Besides that, right now, I'm reading this book called The Elephant in the Brain — kind of like the elephant in the room. It examines a bunch of the self-deceptions that humans keep, and tries to explain why we keep them, and why we're not that good at introspection. That's been really interesting so far.
Along those lines, what are some of the best tools for learning that you've come across, whether it's code or processes in your life?
There are a couple of key parts to it. The first, which I feel like I haven't gotten down a hundred percent, is the concept of “entrypoints”. I think one of the best things that undergrad gave me was exposure to a wide set of fields and people, and also the right entrypoints. College can do a great job of saying "Here's what you know. Now you can enter at this point and go deep in this field and find a bunch of good resources." It’s much better than a random Google search, which just returns the most SEO’d garbage.
I find for general-purpose knowledge, it's pretty hard to locate those entrypoints. A lot of times I'll search for a subreddit, like, "Hey, what would a smart friend who's into this particular thing tell me? Okay, let me search for a subreddit of people like that." Other times I feel like there's this deep web of important papers in the field or interesting blog posts, stuff that people think of as canon but that is hard to find. I do think it's a problem: how do you find the right entrypoint?
Beyond that, I'm a big fan of trying to synthesize what I've learned and then tell someone else. That's part of why I put stuff on the blog or keep a bookshelf on my website. I find I remember it better when I'm trying to light up that lightbulb in someone else's mind.
The last one that I've been noodling over is who your peer group is. Most of the book recommendations and other things I pick up definitely come from people I'm friends with. Are there ways to widen that peer group? Or get some new recommendations from outside your field?
Have you been hacking on anything recently?
I've been doing a couple of different things. To be honest, most of my time recently has been working with the US Digital Response on vaccine distribution.
A big problem that people have with vaccines is they're trying to find appointments. A lot of the current sources out there are good at mapping supply. They're saying, "Hey, we shipped a hundred doses to this Walmart." But they're not so good at actually finding appointments. As you guys both know, as the public, you want to be able to book an appointment!
A couple of other folks and I have been building an appointment finder, which runs right now on AWS and checks [for appointments] every two to three minutes.
Calvin shows a list of vaccine appointment availability.
This is just running for a few states right now. It tells you where there are appointments available and it gives you links so you can easily book them. Right now we're trying to spin this into a national thing that you can get everywhere. That's been most of my time recently. As an aside: if you’re in tech and looking to volunteer, go sign up for USDR! It’s a fantastic way for senior folks to help government programs work a little better.
The other thing that I've been hacking around is an exploratory dataviz tool, which I can also show you. It's only running locally.
The general idea is: you're a technical businessperson who wants to visualize large amounts of data, and maybe you don't know how to code or write SQL. The tools today are not very good. You can dump stuff in Excel, but it crashes for large data sets. You could use a BI [Business Intelligence] tool like Looker or Metabase, but they're kind of clunky and hard for exploration use cases, though they may be okay for reporting. You could learn to code or write SQL, but Bret Victor has this line that "when we write code, we're just blindly manipulating symbols." It's not very tangible. What's actually happening? I've been trying to build a product that solves some of those use cases that I would have as a data explorer.
Is your data visualization work building off problems you saw at Segment, or is it driven by your experience teaching people?
It's both, to be honest. I would call myself a very occasional data analyst at Segment.
In founder capacity, you do a bunch of different things. Sometimes you're writing for the blog. Sometimes you're trying to figure out how to fix a team. Sometimes you're trying to figure out how to talk to a big customer. One of the things that I would occasionally do is take a look at our data and see if there were any insights that I could get from it.
I would dig into our data sets and — remember I’m an occasional data analyst here — I don't have a firm understanding of the hundred columns in our user table in Redshift. There are like three that look like "industry," and I'm not sure which one has the actual values and which one is generated and which ones are just empty.
I would normally do my work in Python. But again, I don't know the pandas [data analysis toolkit] API a hundred percent. I don't know the matplotlib [graphing API] options. I have to look those up every time. It just felt slow.
Calvin shows a data-visualization tool displaying COVID data per-country.
Here’s the app. The general idea is that you could add in datasets, whether they're CSVs or a database table, and it should be really quick to explore and work with data.
I'm thinking about it like Visual Studio Code, where you can create different panes to look at. Here it's just this COVID dataset. And immediately, it already gives me sort of a sense for how this data is structured. There's a "date" field, and "country." There's "confirmed," "recovered," and "deaths." It grows over time. It seems like there's one entry per date/country. If I want to, I can filter this really quickly so I can look at everything that happened on April 25th, 2020. I can see all the different countries.
If I want to search for just South Korea, because they've been doing a good job with pandemic response, I could look at that and I can get a very quick feel of what this dataset looks like.
Potentially I could compare it, assuming I want to look at South Korea versus France or something like that.
Of course, loading stuff in a table is great for understanding what's there, but you probably want to be able to graph stuff too. If I hit command-K here, I can pull up a plot and it lets me plot the Y-axis based upon the number of confirmed cases. Already I can start seeing something that's interesting here — well, this isn't per capita, right, but it's interesting that the two graphs sort of follow the same shape. It seems like South Korea has been hitting a new curve, whereas France evened out for a little while. Ideally, I could click and drag one over to the other to superimpose them a little bit and get a better sense for how these two countries compare. And the nice thing is, it’s really fast.
Say I want to check out Brazil instead: they have even more cases, but that's been a steady climb. One of the things that I care about in addition to the cumulative sum, which you're seeing here, is the delta or the projection, so I can get a better sense of the rate of change for each of these.
That's what I've been hacking on so far. I think of it sort of like a Figma for data: it's really quick and really responsive. I want to add a bunch of sharing and collaboration features. Sadly, I haven't done anything on this in the last month while I've been working more on USDR stuff.
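Editor's sketch: the "delta" mentioned here is the day-over-day change recovered from a running total. The snippet below is a minimal illustration in plain Python, not code from Calvin's tool. The function name `daily_deltas` and the sample counts are made up for the example.

```python
from typing import List


def daily_deltas(cumulative: List[int]) -> List[int]:
    """Turn running totals into per-day changes.

    The first day has no previous value, so its delta is the total itself.
    """
    deltas = []
    previous = 0
    for total in cumulative:
        deltas.append(total - previous)
        previous = total
    return deltas


# Made-up cumulative confirmed counts for one country, one entry per day.
confirmed = [100, 150, 225, 225, 300]
print(daily_deltas(confirmed))  # prints [100, 50, 75, 0, 75]
```

A flat stretch in the cumulative curve shows up as a zero delta, which makes changes in the rate easier to spot than on the cumulative plot. In pandas, `Series.diff()` performs the same subtraction, except that the first entry comes back as NaN.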
In fact, I'm actually about to start at YC as a visiting partner. So I may have to table this for a little bit, but that's my demo so far.
What are you building that in? And generally, what is your favorite technical toolchain these days?
I feel like I'm trying to get back into the technical toolchain a little bit, because for the last three or four years of my time at Segment, I did not write a lot of code.
I've been using TypeScript for everything, and Next.js. I like the flexibility of typed JavaScript with the super broad ecosystem that it supports. I use VS Code for most of my editing, which works fine. For this USDR thing, I deployed it to Amazon with ECS, and I was reminded of how heavyweight Amazon is and how many things there are to set up. For these new projects, there's a company I invested in called Railway, which I think is doing really interesting things. I'm hoping to spin up some new projects on there.
What are you using to do the graphs themselves?
Right now it's using Plotly. Plotly’s okay. There's this really interesting project out of Airbnb called visx, which is React graph components. Eventually, if I really pursued this seriously, I'd probably want something that's a little more custom. The cool thing about Plotly is it gives you some stuff out of the box, like resizing graphs. I think they do some acceleration with WebGL for large data sets, so that's also nice.
I've gone back to the drawing board a little bit, and now the new piece of technology that I'm really interested in is Apache Arrow. I don't know if you all have seen this, but it's supposed to be really good at doing fast in-memory analytics. And — similar to Oso — they have cross-language compilation.
As you've been writing this, as you've been working on the vaccine project, are there any pain points? Things where you're thinking, "man, I wish someone would solve this already." Or where you think, "we have computers, we should be able to do this."
There've been a number of really interesting ones. I think the big thing that my eyes have been opened to with the US Digital Response work is that they work with a lot of different governments. Primarily folks at the state level, but sometimes local levels too. I've been surprised about how many times people need a general-purpose site-builder rather than a database.
Take the problem of actually scheduling vaccines. What every state has had to do is either work with a contractor or build some flow. In a state that I'm working with, doing that involves writing to a Salesforce database and building a bunch of actions that interact with that Salesforce database. You have to do a crazy amount of work! For instance, someone has to create appointments. Then the public has to come in, they have to be able to find appointments. They have to then be able to book them and create a transaction, and you want to be able to follow up over email. It's actually a very complicated flow.
In some states there are eligibility requirements, and it's hard for governments to build these in a way where it doesn't require code. There's a part of me that thinks there should be something like a Retool marketplace. There should be some way of building these building blocks and then giving them up [to governments], but I don't think we've seen that. That's been one interesting one.
The other one, getting back into AWS land, I've been using Terraform for everything just because it's legible and I know it really well. Spinning up things like a VPC and the right subnets and the right security groups and the right load balancers just to get something up and running is... crazy, which is why I'm eager to see tools like Railway get out there and make that experience a lot simpler.
In your working setup, do you have a favorite thing that helps you be productive? A keyboard, monitor or a Pomodoro timer?
I've been liking Roam for logging stuff that I'm doing. But the biggest thing is the bunch of dotfiles that we used at Segment. There's one command that everyone on the team used, which was `goto`. You type `goto` and enter the org/repo and it automatically clones it if you don't already have it, or goes there if you do have it. It all mirrors the GitHub file structure. I think that's probably my favorite dotfile thing ever. That was TJ — one of our first hires — who introduced that. He had a bunch of dotfiles, and I still use a bunch of them — he knows what he's doing when it comes to tooling.
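Editor's sketch: the decision `goto` makes can be written out roughly as below. This is not the Segment dotfile, which is not shown in the interview. The function name `plan_goto`, the `root` directory argument, and the SSH-style clone URL are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional, Tuple


def plan_goto(spec: str, root: Path) -> Tuple[Path, Optional[str]]:
    """Map "org/repo" to a local path, plus a clone URL if it is missing.

    Returns (path, None) when the checkout already exists.
    """
    org, sep, repo = spec.partition("/")
    if not sep or not org or not repo or "/" in repo:
        raise ValueError(f"expected org/repo, got {spec!r}")
    # Mirror the GitHub layout on disk: <root>/<org>/<repo>
    target = root / org / repo
    if target.is_dir():
        return target, None
    # Assumed URL form; an HTTPS URL would work the same way.
    return target, f"git@github.com:{org}/{repo}.git"
```

A shell wrapper would run `git clone` with the returned URL when one is present and then `cd` into the path. A Python process cannot change its parent shell's directory, which is why this sketch only returns the plan.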
What's some solid advice that you'd give to someone who's trying to get to where you are?
I would focus most on learning constantly. Read code, read books, make sure your job feels like you're constantly taking in new information. It doesn't matter where you start. What matters more is your slope and where you’re able to go. That’s what I’ve found to be the most important thing.