Vint Cerf on the exhilarating mix of thrill and hazard at the frontiers of tech – TechCrunch

Image Credits: KENZO TRIBOUILLARD/AFP via Getty Images / Getty Images

Vint Cerf has been a near-constant influence on the internet since the days when he was helping create it in the first place. Today he wears many hats, among them VP and chief internet evangelist at Google. He is to be awarded the IEEE's Medal of Honor at a gala in Atlanta, and ahead of the occasion he spoke with TechCrunch in a wide-ranging interview touching on his work, AI, accessibility and interplanetary internet.

TechCrunch: To start out with, can you tell us how Google has changed in your time there?

Cerf: Well, when I joined the company in 2005, there were 5,000 people already, which is pretty damn big. And of course, my normal attire is three-piece suits. The important thing is that I thought I would be raising the sartorial quotient of the company by joining. And now, almost 18 years later, there are 170-some-odd thousand people, and I have failed miserably. So I hope you don't mind if I take my jacket off.

Go right ahead.

So as you might have noticed, Sergey has come back to do a little bit more on the artificial intelligence side of things, which is something he's always been interested in; I would say historically, we've always had an interest in artificial intelligence. But that has escalated significantly over the past decade or so. The acquisition of DeepMind was a brilliant choice. And you can see some of the outcomes: first the spectacular stuff, like playing Go and winning. And then the more productive stuff, like figuring out how 200 million proteins are folded up.

Then there's the large language models and the chatbots. And I think we're still in a very peculiar period of time, where we're trying to characterize what these things can and can't do, and how they go off the rails, and how do you take advantage of them to do useful work? How do we get them to distinguish fact from fiction? All of that is in my view open territory, but then that's always an exciting place to be, a place where nobody's ever been before. The thrill of discovery and the risk of hazard create a fairly exciting mix, an exhilarating mix.

You gave a talk recently about, I don't want to say the dangers of the large language models, but...

Well, I did say there are hazards there. I was talking to a bunch of investment bankers, or VCs, and I said, you know, don't try to sell stuff to your investors just because it's flashy and shiny. Be cautious about going too fast and trying to apply it without figuring out how to put guardrails in place.

I raised a question of hazard and wanting people to be more thoughtful about which applications made sense. I even suggested an analogy: you know how the Society of Automotive Engineers, they have different risk levels for the self-driving cars? A risk-level idea could apply to artificial intelligence and machine learning.

For entertainment purposes, perhaps it's not too concerning, unless it goes down some dark path, in which case, you might want to put some friction into the system to deal with that, especially a younger user. But then, as you get to the point where you're training these things to do medical diagnosis or make investment advice, or make decisions about whether somebody gets out of jail, now suddenly, the risk factors are extremely high.

We shouldn't be unaware of those risk factors. We can, as we build applications, be prepared to detect excursions away from safe territory, so that we don't accidentally inflict some harm by the use of these kinds of technologies.

So we need some kind of guardrails.

Again, I'm not expert in this space, but I am beginning to wonder whether we need something kind of like that in order to provide a super-ego for the natural language network. So when it starts to go off the rails somewhere, we can observe that that's happening. And a second network that's observing both the input and the output might intervene, somehow, and stop the production of the output.

Sort of a conscience function?

Well, it's not quite conscience, it's closer to executive function, the prefrontal cortex. I want to be careful, I'm only reasoning by metaphor here.

I know that Microsoft has embarked on something like this. Their version of GPT-4 has an intermediary model like that, they call it Prometheus.

Purely as an observation, I had the impression that the Prometheus natural language model would detect and intervene if it thought that the interactions were going down a dark path. I thought that they would implement it in such a way that before you actually say something to the interlocutor that is going down the dark path, you intervene and prevent it from going there at all.

My impression, though, is that it actually produces the output and then discovers that it's produced it, and then it says, "Oh, I shouldn't have done that. Oh, dear, I take that back," or "I don't want to talk to you anymore about that." It's a little bit like the email that you get occasionally from the Microsoft Outlook system that says, "This person would like to withdraw the message."

I love when that happens; it makes me want to read the original message so badly, even if I wouldn't have before.

Yeah, exactly. It's sort of like putting a big red flag in there saying, "Boy, there's something juicy in here."
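[The two orderings Cerf contrasts above, checking a draft before it is shown versus showing it and retracting afterward, can be sketched in a few lines of Python. This is only an illustration of the idea under loud assumptions: `generate`, `is_safe`, `REFUSAL` and both function names are hypothetical stand-ins, and nothing here describes how Microsoft's system or any real product actually works.]

```python
from typing import Callable, List

# Hypothetical stand-ins: a text generator, and a second "monitor" that
# looks at both the input (prompt) and the output (draft).
Generator = Callable[[str], str]
Monitor = Callable[[str, str], bool]  # returns True when the draft is safe to show

REFUSAL = "I'd rather not continue with that."


def respond_with_precheck(prompt: str, generate: Generator, is_safe: Monitor) -> List[str]:
    """Check the draft BEFORE the user sees anything; unsafe drafts are never shown."""
    draft = generate(prompt)
    if is_safe(prompt, draft):
        return [draft]
    return [REFUSAL]


def respond_with_retraction(prompt: str, generate: Generator, is_safe: Monitor) -> List[str]:
    """Show the draft first, then follow up with a retraction if the monitor objects."""
    draft = generate(prompt)
    shown = [draft]  # the user has already seen this
    if not is_safe(prompt, draft):
        shown.append(REFUSAL)
    return shown


# Toy components, purely for demonstration.
def toy_generate(prompt: str) -> str:
    return f"Here is a reply about {prompt}."


def toy_is_safe(prompt: str, draft: str) -> bool:
    return "dark path" not in prompt.lower()


precheck_messages = respond_with_precheck("a dark path", toy_generate, toy_is_safe)
retraction_messages = respond_with_retraction("a dark path", toy_generate, toy_is_safe)
```

[In the pre-check version the user only ever sees the refusal. In the retraction version the unsafe draft is in the list of shown messages ahead of the refusal, which is the "withdrawn email" effect described above.]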

You mentioned the AI models, that it's an interesting place to work. Do you get the same sort of foundational flavor that you got from working on protocols and other big shared things over the years?

Well, what we are seeing is emergent properties of these large language models that are not necessarily anticipated. And there have been emergent properties showing up in the protocol world. Flow control in particular is a vast headache in the online packet-switched environment, and people have been tackling these problems inside and outside of Google for years.

One of the examples of emergent properties that I think very few of us thought about is the domain name business. Once they had value, suddenly, all kinds of emergent properties show up, people with interests that conflict and have to be resolved. Same for internet address space; it's an even more weird environment where people actually buy IPv4 addresses for like $50 each.

I confess to you that as I watched the auctions for IPv4 address space, I was thinking how stupid I was. When I was at the Defense Department in charge of all this, I should have allocated the slash eight, which is 16 million addresses, to myself, and just sit on it, you know, for 50 years, then sell it and retire.

Even simple systems have the ability to surprise you. Especially when you have simple systems where a large number of them are interacting with each other. I've found myself not necessarily recognizing when these emergent properties will come, but I will say that whenever something gets monetized, you should anticipate there will be emergent properties and possibly unexpected behavior, all driven by greed.
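[For readers who want to check the "slash eight" arithmetic mentioned above: an IPv4 address is 32 bits, so a /8 prefix leaves 24 bits free, or 2^24 addresses. The short Python sketch below works that out and multiplies by the rough $50-per-address figure quoted in the interview. The function and variable names are illustrative only, and the dollar total is a back-of-the-envelope product, not a market valuation.]

```python
def addresses_in_prefix(prefix_length: int) -> int:
    """Number of IPv4 addresses covered by a prefix of the given length."""
    if not 0 <= prefix_length <= 32:
        raise ValueError("IPv4 prefix length must be between 0 and 32")
    # 32-bit addresses: the bits not fixed by the prefix are free to vary.
    return 2 ** (32 - prefix_length)


slash_eight = addresses_in_prefix(8)   # 2**24 = 16,777,216
price_per_address_usd = 50             # rough figure quoted in the interview
block_value_usd = slash_eight * price_per_address_usd

print(f"{slash_eight:,} addresses, roughly ${block_value_usd:,} at $50 each")
```

[That is where the "16 million addresses" figure comes from; at $50 apiece the block would come to a little under $839 million.]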

Let me ask you about some other stuff you're working on. I'm always happy when I see cutting-edge tech being applied to people who need it, people with disabilities, people who like just have not been addressed by the current use cases of tech. Are you still working in the accessibility community?

I am very active in the accessibility space. At Google, we have a number of what we call employee resource groups, or ERGs. I'm executive sponsor for some of them, including one for Googlers who have hearing problems. And there is a disabilities-oriented group, which involves employees who either have disabilities or family members that have disabilities, and they share their stories with each other because often people have similar problems, but don't know what the solutions were for other people. Also, it's just nice to know that you're not alone in some of these challenges. There's another group called the Grayglers for people that have a little gray in their hair, and I'm the executive sponsor for that. And of course, the focus of attention there is the challenges that arise as you get older, even as you think about retirement and things like that.

When a lot of so-called Web 2.0 stuff came out 10 years ago, it was totally inaccessible, broke all the screen readers, all this kind of stuff. Somebody has to step in and say, look, we need to have this standard, or else you're leaving out millions of people. So I'm always interested to hear about what interesting projects or organizations or people are out there.

What I have come to believe is that engineers being just given a set of specs that say, "If you do it this way, it will meet this level of the standard," that doesn't necessarily produce intuition. You really have to have some intuition in order to make things accessible.

So I've come to the conclusion that what we really need is to show people examples of something which is not accessible, and something that is, and let them ingest as many examples as we can give them, because their neural networks will eventually figure out, what is it about this design that makes it accessible? And how do I apply that insight into the next design that I do? So, seeing what works and what doesn't work is really important. And you often learn a lot more from what doesn't work than you do from what does.

There's a guy named Gregg Vanderheiden, who's at the University of Maryland; he and I did a two-day event [the Future of Interface Workshop] looking at research on accessibility and trying to frame what this is going to look like over the next 10 or 20 years. It really is quite astonishing what the technology might be able to do to act as an augmenting capability for people that need assistance. There's great excitement, but at the same time great disappointment, because we haven't used it as effectively as I think we could have. It's kind of like how Alexander Graham Bell invented a telephone that can't be used by people who are deaf, which is why he was working on it in the first place.

It is a funny contradiction of priorities. One thing where I do see some of the large language and multimodal AI models helping out is that they can describe what they are seeing, even if you can't see it. I know that one of GPT-4's first applications was in an application for blind people to view the world around them.

We're experiencing something close to that right this minute. Since I wear hearing aids, I'm making use of the captioning capability. And at the moment since this is Zoom rather than a Google Meet, there isn't any setting on this one for closed captioning. I'm exercising the Zoom application through the Chrome browser, and Google has developed a capability for the Chrome browser to detect speech in the incoming sound.

So packets are coming in and they're known to be sound, it passes through an identification system that produces a caption bar, which you can move around on the screen. And that's been super helpful for me. For cases like this, where the application doesn't have captioning, or for random streaming video that might be coming in and hasn't been captioned, the caption window automatically pops up. In theory, I think we can do this in 100 different languages, although I don't know that we've activated it for more than four or five. As you say, these tools will become more and more normal, and as time goes on, people will expect the system to adapt to their needs.

So language translation and speech recognition are quite powerful, but I do want to mention something that I found vaguely unsettling. Recently, I encountered an example of a conversation between a reporter and a chatbot. But he chose deliberately to take the output of the chatbot and have it spoken by the system. And he chose the style of a famous British explorer [David Attenborough].

The text itself was quite well formed, but coming with Attenborough's accent just added to the weight of the assertions even when they were wrong. The confidence levels, as I'm sure you've seen, are very high, even when the thing doesn't know what it's talking about.

The reason I bring this up is that we are allowing in these indicators of, how should we say this, of quality, to fool us. Because in the past, they really did mean it was David Attenborough. But here it's not, it's just his voice. I got to thinking about this, and I realized there was an ancient example of exactly this problem that showed up 50 years ago at Xerox PARC.

They had a laser printer, and they had the Alto workstation and the Bravo text editor. It meant the first draft of anything you typed would be printed out beautifully formatted with lovely forms and everything else. Normally, you would never see that production quality until after everything had been edited, you know, wrestled with by everybody to get the text formatted, picture-perfect stuff. That meant the first draft stuff came out looking like it was final draft. People didn't understand that they were nuts, that they were seeing first-round stuff, and that it wasn't complete, or necessarily even satisfactory.

So it occurred to me that we've reached a point now where technology is fooling us into giving it more weight than it deserves, because of certain indicia that used to be indicative of the investment made in producing it. And I'm not quite sure what to do about that.

I don't think anyone is!

I think somehow or another, we need to make it clear what the provenance is of the thing that we're looking at. Like how we needed to say this is first-draft material, you know, don't make any assumptions. So provenance turns out to be a very important concept, especially in a world where we have the ability to imbue content with attributes that we would normally interpret in one way. Like, it's David Attenborough speaking, and we should listen to that. And yet we have to think more critically about them. Because in fact, the attribute is being delivered artificially.

And perhaps maliciously.

Certainly that too. And this is why critical thinking has become an important skill. But it doesn't work very well, unless you have enough information to understand the provenance of the material that you're looking at. I think we are going to have to invest more in provenance and identity in order to evaluate the quality of that which we are experiencing.

I wanted to ask you about interplanetary internet, because that whole area is extremely interesting to me.

Well, this one, of course, gets started way back in 1998. But I'm a science fiction reader from way back, back to age 10 or something, so I got quite excited when it was possible to even think about the possibility of designing and building a communication system that would span the solar system.

The team got started very small, and now 25 years later involves many of the space agencies around the world: JAXA, the Korean Space Agency, NASA and so on. And a growing team of people who are either government funded to do space-based research, or volunteers. There's a special interest group called the Interplanetary Networking Special Interest Group, which is part of the Internet Society; that thing got started in 1998. But it has now grown to like 900 people around the world who are interested in this stuff.

We've standardized this stuff, we're on version seven of it, we're running it up in the International Space Station. It's intended to be available for the return to the moon and Artemis missions. I'm not going to see the end result of all this, but I'm going to see the first couple of chapters. And I'm very excited about that, because it's not crazy to actually think about. Like all my other projects, it takes a long time. Patience and persistence!

For something like this it must have been a real challenge, but also a very familiar one. In some ways building something like this is what you've been doing your whole career. This is just a different set of constraints and capabilities.

You put your finger on it, exactly right. This is in a different parametric space than the one that works for TCP/IP. And we're still bumping into some really interesting problems, especially where you have TCP/IP networks running on the moon, for example, locally and interconnecting with other internets on other planets, going through the interplanetary protocol. What does that look like? You know, which IP addresses should be used? We have to figure out, well, how the hell does the Domain Name System work in the context of internets that aren't on the planet? And it's really fun!
