In today's column, I am going to walk you through a prominent AI-mystery that has caused quite a stir, leading to an incessant buzz across much of social media and garnering outsized headlines in the mass media. This is going to be quite a Sherlock Holmes adventure, a sleuthing detective journey that I will be taking you on.
Please put on your thinking cap and get yourself a soothing glass of wine.
The roots of the circumstance involve the recent organizational gyrations and notable business crisis drama associated with the AI maker OpenAI, including the off-again, on-again firing and then rehiring of the CEO Sam Altman, along with a plethora of related carryings-on. My focus will not particularly be the comings and goings of the parties involved. I instead seek to leverage those reported facts primarily as telltale clues associated with the AI-mystery that some believe sits at the core of the organizational earthquake.
We shall start with the vaunted goal of arriving at the topmost AI.
The Background Of The AI Mystery
So, here's the deal.
Some suggest that OpenAI has landed upon a new approach to AI that either has attained true AI, which is nowadays said to be Artificial General Intelligence (AGI), or that demonstrably resides on or at least shows the path toward AGI. As a fast backgrounder for you, today's AI is considered not yet at the realm of being on par with human intelligence. The aspirational goal for much of the AI field is to arrive at something that fully exhibits human intelligence, which would broadly then be considered as AGI, or possibly going even further into superintelligence (for my analysis on what these AI superhuman aspects might consist of, see the link here).
Nobody has yet been able to find out and report specifically on what this mysterious AI breakthrough consists of (if indeed such an AI breakthrough was at all devised or invented). This situation could be like one of those circumstances where the actual occurrence is a far cry from the rumors that have reverberated in the media. Maybe the reality is that something of modest AI advancement was discovered but doesn't deserve the hoopla that has ensued. Right now, the rumor mill is filled with tall tales that this is the real deal and supposedly will open the door to reaching AGI.
Time will tell.
On the matter of whether the AI has already achieved AGI per se, let's noodle on that postulation. It seems hard to imagine that if the AI became true AGI we wouldn't already be regaled with what it is and what it can do. That would be a chronicle of immense magnitude. Could the AI developers involved be capable of keeping a lid on such a life goal attainment, as if they had miraculously found the source of the Nile or essentially turned stone into gold?
Seems hard to believe that the number of people likely knowing this fantastical outcome would be utterly secretive and mum for any considerable length of time.
The seemingly more plausible notion is that they arrived at a kind of AI that shows promise toward someday arriving at AGI. You could likely keep that a private secret for a while. The grand question though looming over this would be the claimed basis for asserting that the AI is in fact on the path to AGI. Such a basis should conceivably be rooted in substantive ironclad logic, one so hopes. On the other hand, perhaps the believed assertion of being on the path to AGI is nothing more than a techie hunch.
Those kinds of hunches are at times hit-and-miss.
You see, this is the way that those ad hoc hunches frequently go. You think you've landed on the right trail, but you are actually once again back in the woods. Or you are on the correct trail, but the top of the mountain is still miles upon miles in the distance. Simply saying or believing that you are on the path to AGI is not necessarily the same as being on said path. Even if you are on the AGI path, perhaps the advancement is a mere inch whilst the distance ahead is still far away. One can certainly rejoice in advancing an inch, don't get me wrong on that. The issue is how much the inch is parlayed into being portrayed intentionally or inadvertently as getting us to the immediate doorstep of AGI.
The Clues That Have Been Hinted At
Now that you know the overarching context of the AI mystery, we are ready to dive into the hints or clues that so far have been reported on the matter. We will closely explore those clues. This will require some savvy Sherlock Holmes AI-considered insights.
A few caveats are worth mentioning at the get-go.
A shrewd detective realizes that some clues are potentially solid inklings, while some clues are wishy-washy or outright misleading. When you are in the fog of war about solving a mystery there is always a chance that you are bereft of sufficient clues. Later on, once the mystery is completely solved and revealed, only then can you look back and discern which clues were on target and which ones were of little use. Alluringly, clues can also be a distraction and take you in a direction that doesn't solve the mystery. And so on.
Given those complications, let's go ahead and endeavor to do the best we can with the clues that seem to be available at this time (more clues are undoubtedly going to leak out in the next few days and weeks; I'll provide further coverage in my column postings as that unfolds).
I am going to draw upon these relatively unsubstantiated foremost three clues:
You can find lots of rampant speculation online that uses only the first of those above clues, namely the name of Q*. Some believe that the mystery can be unraveled on that one clue alone. They might not know about the other two above clues. Or they might not believe that the other two clues are pertinent.
I am going to choose to use all three clues and piece them together in a kind of mosaic that may provide a different perspective than others have espoused online about the mystery. Just wanted to let you know that my detective work might differ somewhat from other narratives you might read about elsewhere online.
The First Clue Is The Alleged Name Of The AI
It has been reported widely that the AI maker has allegedly opted to name the AI software as being referred to by the notation of a capital letter Q that is followed by an asterisk.
The name or notation is this: Q*.
Believe it or not, by this claimed name alone, you can go into a far-reaching abyss of speculation about what the AI is.
I will gladly do so.
I suppose it is somewhat akin to the word Rosebud in the famous classic film Citizen Kane. I won't spoil the movie other than to emphasize that the entire film is about trying to make sense of the seemingly innocuous word of Rosebud. If you have time to do so, I highly recommend watching the movie since it is considered one of the best films of all time. There isn't any AI in it, so realize you would be watching the movie for its incredible plot, splendid acting, eye-popping cinematography, etc., and relishing the deep mystery ardently pursued throughout the movie.
Back to our mystery in hand.
What can we divine from the Q* name?
Those of you who are faintly familiar with everyday mathematical formulations are likely to realize that the asterisk is typically said to represent a so-called star symbol. Thus, the seemingly Q-asterisk name would conventionally be pronounced aloud as Q-star rather than as Q-asterisk. There is nothing especially out of the ordinary in mathematical notations to opt to make use of the asterisk as a star notation. It is done quite frequently, and I will shortly explain why this is the case.
Overall, the use specifically of the letter Q innately coupled with the star representation does not notably denote anything already popularized in the AI field. Ergo, I am saying that Q* doesn't jump out as meaning this particular AI technique or that particular AI technology. It is simply the letter Q that is followed by an asterisk (which we naturally assume by convention represents a star symbol).
Aha, our thinking caps now come into play.
We will separate the letter Q from its accompanying asterisk. Doing so is seemingly productive. Here's why. The capital letter Q does have significance in the AI field. Furthermore, the use of an asterisk as a star symbol does have significance in the mathematics and computer science arena. By looking at the significance of each distinctly, we can subsequently make a reasonable leap of logic as a result of considering the meaning associated when they are combined in unification.
I will start by unpacking the use of the asterisk.
What The Asterisk Or Star Symbol Signifies
One of the most historically well-known uses of the asterisk in a potentially similar context was the use by the mathematician Stephen Kleene when he defined something known as V*. You might cleverly observe that this notation consists of the capital letter V that is followed by the asterisk. It is pronounced as V-star.
In his paper published in the 1950s, he described that suppose you had a set of items that were named by the capital letter V, and you then decided to make a different set that consisted of various combinations associated with the items that are in the set V. This new set will by definition contain all the elements of set V and will show them furthermore in as many concatenated ways as we can come up with. The resulting new set will be denoted as V* (there are other arcane rules about this formulation, but I am only seeking to give a brief tasting herein).
As an example about this matter, suppose that I had a set consisting of the first three lowercase letters of the alphabet: {a, b, c}. I will go ahead and refer to that set as the set V. We have a set V that consists of {a, b, c}.
You are then to come up with V* by making lots of combinations of the elements in V. You are allowed to repeat the elements as much as you wish. Thus, the V* will contain elements like this: {a, b, c, ab, ac, ba, bc, aa, bb, cc, aaa, aab, aac, ...}.
I trust that you see that the V* is a combination of the elements of V. This V* is kind of amazing in that it has all kinds of nifty combinations. I am not going to get into the details of why this is useful and will merely bring your attention to the fact that the asterisk or star symbol suggests that whatever set V you have there is another set V* that is much richer and fuller. I would recommend that those of you keenly interested in mathematics and computer science might want to see a classic noteworthy article by Stephen Kleene entitled "Representation of Events in Nerve Nets and Finite Automata" which was published by Princeton University Press in 1956. You can also readily find lots of explanations online about V*.
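For those who like to see such things run, here is a minimal Python sketch of the idea, offered as an illustration rather than as Kleene's own formulation. The function name `kleene_star` and the `max_length` cutoff are my own assumptions: the full V* goes on forever, so the sketch stops at a chosen length. Note also that the formal definition includes the empty string, which the informal listing above leaves out.

```python
from itertools import product

def kleene_star(alphabet, max_length):
    """Return every string over `alphabet` with length 0..max_length, shortest first.

    The true V* is infinite; `max_length` is an illustrative cutoff (an assumption).
    """
    symbols = sorted(alphabet)
    strings = []
    for length in range(max_length + 1):
        # product(..., repeat=length) yields every ordered pick of `length` symbols,
        # repeats allowed; length 0 yields the single empty string.
        for combo in product(symbols, repeat=length):
            strings.append("".join(combo))
    return strings

V = {"a", "b", "c"}
sample = kleene_star(V, 2)
print(sample)
```

With three letters and a cutoff of two, the sketch yields 13 strings: the empty string, the 3 single letters, and the 9 two-letter combinations.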
Your overall takeaway here is that when you use a capital letter and join it with an asterisk, the conventional implication in mathematics and computer science is that you are saying that the capital letter is essentially supersized. You are magnifying whatever the original thing is. To some degree, you are said to be maximizing it to the nth degree.
Are you with me on this so far?
I hope so.
Let's move on and keep this asterisk and star symbol stuff in mind.
The Use Of Asterisk Or Star In The Case Of Capital A
You are going to love this next bit of detective work.
I've brought you up to speed about the asterisk and showed you an easy example involving the capital letter V. Well, in the AI field, there is a famous instance that involves the capital letter A. We have hit a potential jackpot regarding the underlying mystery being solved, some believe.
Allow me to explain.
The famous instance of the capital letter A which is accompanied by an asterisk in the field of AI is shown this way: A*. It is pronounced as A-star.
As an aside, when I was a university professor, I always taught A* in my university classes on AI for undergraduates and graduates. Any budding computer science student learning about AI should be at least aware of the A* and what it portends. This is a foundational keystone for AI.
In brief, a research paper in the 1960s proposed an AI foundational approach to a difficult mathematical problem such as trying to find the shortest path to get from one city to another city. If you are driving from Los Angeles to New York and you have, let's assume, thirty cities that you might go through to get to your destination, which cities would you pick to minimize the time or distance for your planned trip?
You certainly would want to use a mathematical algorithm that can aid in calculating the best or at least a really good path to take. This also relates to the use of computers. If you are going to use a computer to figure out the path, you want a mathematical algorithm that can be programmed to do so. You want that mathematical algorithm to be implementable on a computer and run as fast as possible or use the least amount of computing resources as you can.
The classic paper that formulated A* is entitled "A Formal Basis for the Heuristic Determination of Minimum Cost Paths" by Peter Hart, Nils Nilsson, and Bertram Raphael, published in IEEE Transactions on Systems Science and Cybernetics, 1968. The researchers said this:
The paper proceeds to define the algorithm that they named as A*. You can readily find online lots and lots of descriptions about how A* works. It is a step-by-step procedure or technique. Besides being useful for solving travel-related problems, the A* is used for all manner of search-related issues. For example, when playing chess, you can think of finding the next chess move as a search-related problem. You might use A* and code it into part of a chess-playing program.
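To give a flavor of that step-by-step procedure, here is a compact Python sketch of A* on a toy road map. Please treat it as an illustration under stated assumptions and not as the paper's own listing: the intermediate cities, the road costs, and the heuristic estimates are all made-up numbers (not real mileage), and the function name `a_star` is mine. The heuristic values were chosen so that they never overestimate the remaining cost, which is the condition A* relies on to return a cheapest route.

```python
import heapq

def a_star(graph, heuristic, start, goal):
    """Return (total_cost, path) for a cheapest route, or None if goal is unreachable.

    graph: dict mapping node -> {neighbor: step_cost}
    heuristic: dict mapping node -> optimistic estimate of remaining cost to goal
    """
    # Each frontier entry is (cost so far + estimate, cost so far, node, path taken).
    frontier = [(heuristic[start], 0, start, [start])]
    best_cost = {start: 0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best_cost.get(node, float("inf")):
            continue  # a cheaper way to this node was already found
        for neighbor, step_cost in graph[node].items():
            new_cost = cost + step_cost
            if new_cost < best_cost.get(neighbor, float("inf")):
                best_cost[neighbor] = new_cost
                priority = new_cost + heuristic[neighbor]
                heapq.heappush(frontier, (priority, new_cost, neighbor, path + [neighbor]))
    return None

# Hypothetical road costs in arbitrary units (assumptions, not real distances).
roads = {
    "Los Angeles": {"Denver": 10, "Dallas": 14},
    "Denver": {"Los Angeles": 10, "Chicago": 10},
    "Dallas": {"Los Angeles": 14, "Chicago": 9, "New York": 16},
    "Chicago": {"Denver": 10, "Dallas": 9, "New York": 8},
    "New York": {},
}
# Hypothetical optimistic guesses of the remaining cost to New York.
estimate = {"Los Angeles": 25, "Denver": 17, "Dallas": 15, "Chicago": 7, "New York": 0}

result = a_star(roads, estimate, "Los Angeles", "New York")
print(result)
```

With these made-up numbers, the search returns the route through Denver and Chicago at a total cost of 28, beating the two routes through Dallas (30 and 31). The heuristic is what lets the search favor promising cities first instead of exploring every road equally.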
You might be wondering whether the A* has a counterpart possibly known as simply A. In other words, I mentioned earlier that we have V* which is a variant or supersizing of V. You'll be happy to know that some believe that A* is somewhat based on an algorithm which is at times known as A.
Do tell, you might be thinking.
In the 1950s, the famous mathematician and computer scientist Edsger Dijkstra came up with an algorithm that is considered one of the first articulated techniques to figure out the shortest paths between various nodes in a weighted graph (once again, akin to the city traveling problem and more).
Interestingly, he figured out the algorithm in 1956 while sitting in a café in Amsterdam and, according to his telling of how things arose, the devised technique only took about twenty minutes for him to come up with. The technique became a core part of his lifelong legacy in the field of mathematics and computer science. He took his time to write it up. He published a paper about it three years later, and it is a highly readable and mesmerizing read, see E. W. Dijkstra, "A Note on Two Problems in Connection with Graphs", published in Numerische Mathematik, 1959.
Some have suggested that the later devised A* is essentially based on the A of his works. There is a historical debate about that. What can be said with relative sensibility is that the A* is a much more extensive and robust algorithm for doing similar kinds of searches. I'll leave things there and not get mired in the historical disputes.
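For the curious, here is a short Python sketch of Dijkstra's shortest-path technique on a small weighted graph. It is a modern rendering under my own assumptions: the node names and weights are invented for illustration, the function name `dijkstra` is mine, and the priority queue is a common present-day implementation choice rather than how the 1959 paper presents things.

```python
import heapq

def dijkstra(graph, source):
    """Return a dict of the cheapest known distance from `source` to each reachable node.

    graph: dict mapping node -> {neighbor: non-negative edge weight}
    """
    distances = {source: 0}
    frontier = [(0, source)]  # entries are (distance so far, node)
    while frontier:
        dist, node = heapq.heappop(frontier)
        if dist > distances[node]:
            continue  # stale entry; a shorter route was already recorded
        for neighbor, weight in graph[node].items():
            candidate = dist + weight
            if candidate < distances.get(neighbor, float("inf")):
                distances[neighbor] = candidate
                heapq.heappush(frontier, (candidate, neighbor))
    return distances

# Hypothetical weighted graph (names and weights are assumptions for illustration).
graph = {
    "s": {"u": 4, "v": 1},
    "u": {"w": 1},
    "v": {"u": 2, "w": 5},
    "w": {"t": 3},
    "t": {},
}
shortest = dijkstra(graph, "s")
print(shortest)
```

Notice that the direct edge from s to u costs 4, yet going via v costs only 3, so the algorithm settles on the detour. Unlike A*, there is no guess about the remaining distance to a destination; the search simply expands outward from the source in order of distance traveled so far.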
I'd like to add two more quick comments about the use of the asterisk symbol in the computer field.
First, those of you who happen to know coding or programming or the use of computer commands are perhaps aware that a longstanding use of the asterisk has been as a wildcard character. This is pretty common. Suppose I want to inform you that you are to identify all the words that can be derived based on the root word or letters dog. For example, you might come up with the word doggie or the word dogmatic. I could succinctly tell you what you can do by putting an asterisk at the end of the root word, like this: dog*. The asterisk is considered once again to be a star symbol and implies that you can put whatever letters you want after the first fixed set of three letters of dog.
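As a quick illustration of the wildcard idea, here is a tiny Python sketch using the standard library's `fnmatch` module, which does shell-style pattern matching. The word list is made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical word list; only words that start with "dog" should match.
words = ["dog", "doggie", "dogmatic", "hotdog", "cat"]

# In shell-style patterns, "*" stands for any run of characters, including none.
matches = [word for word in words if fnmatchcase(word, "dog*")]
print(matches)
```

Because the star can also stand for zero characters, the bare word dog matches along with doggie and dogmatic, whereas hotdog does not since the pattern fixes dog at the start.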
Secondly, another perspective on the asterisk when used with a capital letter is that it is the last or furthest possible iteration or version of something. Let's explore this. Suppose I make a piece of software and I decide to refer to it via the capital letter B. My first version might be referred to as B1. My second version might be referred to as B2. On and on this goes. I might later on have B26, the twenty-sixth version, and much later maybe B8245 which is presumably the eight thousand two hundred forty-fifth version.
A catchy or cutesy way to refer to the end of all of the versions might be to say B*. The asterisk or star symbol in this case tells us that whatever is named as B* is the highest or final of all of the versions that we could ever come up with.
I will soon revisit these points and show you why they are part of the detective work.
The Capital Letter Q Is Considered A Hefty Clue
You are now aware of the asterisk or star symbol. Congratulations!
We need to delve into the capital letter Q.
The seemingly most likely reference to the capital letter Q that exists in the field of AI would indubitably be something known as Q-learning. Some have speculated that the Q might instead be a reference to the work of the famous mathematician Richard Bellman and his optimal value function in the Bellman equation. Sure, I get that. We don't know if that's the reference being made. I'm going to make a detective instinctive choice and steer toward the Q that is in Q-learning.
I'm using my Ouija board to help out.
Sometimes it is right, sometimes it is wrong.
Q-learning is an important AI technique. Once again, it is a topic that I always covered in my AI classes and that I expected my students to know by heart. The technique makes use of reinforcement learning. You are already generally aware of reinforcement learning by your likely life experiences.
Let's make sure you are comfortable with the intimidatingly fancy phrase reinforcement learning.
Suppose you are training a dog to perform a handshake or shall we say paw shake. You give the dog a verbal command such as telling the cute puppy to do a handshake. The dog lifts its tiny paw to touch your outreached hand. To reward this behavior, you give the dog a delicious canine treat.
You continue doing this repeatedly. The dog is rewarded with a treat for each time that it performs the heartwarming trick. If the dog doesn't do the trick when commanded, you don't provide the treat. In a sense, the denial of a treat is almost a penalty too. You could have a more explicit penalty such as scowling at the dog, but usually, the more advisable course of action is to focus on rewards rather than also including explicit penalties.
All in all, the dog is being taught by reinforcement learning. You are reinforcing the behavior you desire by providing rewards. The hope is that the dog is somehow within its adorable canine brain getting the idea that doing a handshake is a good thing. The internal mental rules that the dog is perhaps devising are that when the command to do a handshake is spoken, the best bet is to lift its handy paw since doing so is amply rewarded.
Q-learning is an AI technique that seeks to leverage reinforcement learning in a computer or is said to be implemented computationally.
The algorithm consists of mathematically and computationally examining a current state or step and trying to figure out which next state or step would be the best to undertake. Part of this consists of anticipating the potential future states or steps. The idea is to see if the rewards associated with those future states can be added up and provide the maximum attainable reward.
You presumably do something like this in real life.
Consider this. If I choose to go to college, I might get a better-paying job than if I don't go to college. I might also be able to buy a better house than if I didn't go to college. There are lots of possible rewards so I might add them all up to see how much that might be. That is one course or sequence of steps and maybe it is good for me or maybe there is something better.
If I don't go to college, I can start working in my chosen field of endeavor right away. I will have four years of additional work experience over those who went to college. It could be that those four years of experience will give me a long-lasting advantage over having used those years to go to college. I consider the down-the-road rewards associated with that path.
Upon adding up the rewards for each of those two respective paths, I might decide that whichever path has the maximum calculated reward is the better one for me to pick. You might say that I am adding up the expected values. To make things more powerful, I might decide to weight the rewards. For example, I mentioned that I am considering how much money I will make. It could be that I also am considering the type of lifestyle and work that I will do. I could give greater weight to the type of lifestyle and work while giving a bit less weight to the money side of things.
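The arithmetic of that weighing can be sketched in a few lines of Python. Everything here is invented for illustration: the reward scores, the weights, and the function names `weighted_total` and `best_path` are my assumptions and carry no real-world meaning beyond showing how the weights can tip the choice.

```python
def weighted_total(rewards, weights):
    """Add up each reward multiplied by how much we care about that kind of reward."""
    return sum(weights[name] * value for name, value in rewards.items())

def best_path(paths, weights):
    """Return the name of the path with the largest weighted total, plus all totals."""
    totals = {name: weighted_total(rewards, weights) for name, rewards in paths.items()}
    return max(totals, key=totals.get), totals

# Hypothetical reward scores for the two life paths (made-up numbers).
paths = {
    "college": {"money": 9, "lifestyle": 5},
    "work_now": {"money": 6, "lifestyle": 8},
}

# Weighting lifestyle more heavily than money.
best_lifestyle, totals_lifestyle = best_path(paths, {"money": 2, "lifestyle": 3})
# Weighting money more heavily than lifestyle.
best_money, totals_money = best_path(paths, {"money": 3, "lifestyle": 2})

print(best_lifestyle, totals_lifestyle)
print(best_money, totals_money)
```

With these made-up scores, favoring lifestyle picks the work-right-away path (36 versus 33), while favoring money picks college (37 versus 34). The rewards did not change; only the weights did.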
The formalized way to express all of this is that an agent, which in the example is me, will be undertaking a series of steps, which we will denote as states, and taking actions that transition the agent from one state to the next state. The goal of the agent entails maximizing a total reward. Upon each state or step taken, a reevaluation will occur to recalculate which next step or state seems to be the best to take.
Notice that I did not beforehand know for sure which would be the best or right steps to take. I am going to make an estimate at each state or step. I will figure things out as I go along. I will use each reward that I encounter as a further means to ascertain the next state or step to take.
Given that description, I hope you can recognize that perhaps the dog that is learning to do a handshake is doing something similar to this (we can't know for sure). The dog has to decide at each repeated trial whether to do the handshake. It is reacting in the moment, but also perhaps anticipating the potential for future rewards too. We do not yet have a means to have the dog tell us what it is thinking so we don't know for sure what is happening in that mischievous canine mind.
I want to proffer a few more insights about Q-learning and then we will bring together everything that I have so far covered. We need to steadfastly keep in mind that we are on a quest. The quest involves solving the mystery of the alleged AI that might be heading us toward AGI.
Q-learning is often depicted as making use of a model-free and off-policy approach to reinforcement learning. That's a mouthful. We can unpack it.
Here are some of my off-the-cuff definitions that are admittedly loosey-goosey but I believe are reasonably expressive of the model and policy facets associated with Q-learning (I ask for forgiveness from the strict formalists that might view this as somewhat watered down):
Take a look at those definitions. I have noted in italics the model-free and the off-policy. I also gave you the opposites, namely model-based and the on-policy approaches since those are each respectively potentially contrasting ways of doing things. Q-learning goes the model-free and off-policy route.
The significance is that Q-learning proceeds on a trial-and-error basis (considered to be model-free) and tries to devise rules while proceeding ahead (considered to be off-policy). This is a huge plus for us. You can use Q-learning without having to come up in advance with a pre-stipulated model of how it is supposed to do things. Likewise, you don't have to come up with a bunch of rules beforehand. The overall algorithm proceeds to essentially get things done on the fly as the activity proceeds and self-derives the rules. Of related noteworthiness is that the Q-learning approach makes use of data tables and data values that are known as Q-tables and Q-values (i.e., the capital letter Q gets a lot of usage in Q-learning).
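To make the Q-table and Q-value talk concrete, here is a minimal Python sketch of tabular Q-learning. It is a toy under stated assumptions and has nothing to do with whatever OpenAI may or may not have built: the five-cell corridor world, the reward of 1 for reaching the last cell, the learning rate, the discount factor, and all of the names are invented for illustration. In the usual textbook sense, the sketch is model-free because the learner only samples `step` and never inspects how it works, and it is off-policy because the agent wanders at random while the update rule assumes the best available next action.

```python
import random

N_STATES = 5               # corridor cells 0..4; cell 4 is the goal (hypothetical world)
ACTIONS = ("left", "right")
ALPHA, GAMMA = 0.5, 0.9    # learning rate and discount factor (illustrative values)

def step(state, action):
    """Toy environment: move one cell; reaching the last cell pays 1 and ends the episode."""
    next_state = max(state - 1, 0) if action == "left" else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    # The Q-table holds one Q-value per (state, action) pair, all starting at zero.
    q_table = {s: {a: 0.0 for a in ACTIONS} for s in range(N_STATES)}
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # safety cap on episode length
            action = rng.choice(ACTIONS)  # behavior: purely random trial and error
            next_state, reward, done = step(state, action)
            # Target: immediate reward plus the discounted best Q-value of the next state.
            best_next = 0.0 if done else max(q_table[next_state].values())
            target = reward + GAMMA * best_next
            # Nudge the current estimate part of the way toward the target.
            q_table[state][action] += ALPHA * (target - q_table[state][action])
            state = next_state
            if done:
                break
    return q_table

q_table = train()
# Read off the learned rule: in each non-goal cell, pick the action with the larger Q-value.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

Even though the agent only ever stumbled around at random, the learned Q-values end up favoring a move to the right in every cell, which is the self-derived rule for reaching the goal. Nobody told it that rule beforehand.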
Okay, I appreciate that you have slogged through this perhaps obtuse or complex topic.
Your payoff is next.
The Mystery Of Q* In Light Of Q And Asterisks
You now have a semblance of what an asterisk means when used with a capital letter. Furthermore, I am leaning you toward assuming that the capital letter Q is a reference to Q-learning.
Let's jam together the Q and the asterisk and see what happens, namely this: Q*.
The combination might mean this. The potential AI breakthrough is labeled as Q because it has to do with the Q-learning technique, and maybe the asterisk or star symbol is giving us a clue that the Q-learning has somehow been advanced to a notably better version or variant. The asterisk might suggest that this is the highest or most far-out capability of Q-learning that anyone has ever seen or envisioned.
Wow, what an exciting possibility.
This would imply that the use of reinforcement learning as an AI-based approach that is model-free and off-policy can leap tall buildings and go faster than a speeding train (metaphorically) in pushing AI closer to being AGI. If you place this into the context of generative AI such as ChatGPT by OpenAI and GPT-4 of OpenAI, perhaps those generative AI apps could be much more fluent and seem to convey reasoning if they had this Q* included in them (or this might be included in the GPT-5 that is rumored to be under development).
If only OpenAI has this Q* breakthrough (if there is such a thing), and if the Q* does indeed provide a blockbuster advantage, presumably this gives OpenAI a substantial edge over their competition. This takes us to an intriguing and ongoing AI ethics question. For my ongoing and extensive coverage of AI ethics and AI law, see the link here and the link here, just to name a few.