AI Is Just A Majordomo

The IT world is on fire right now with solutions to every major problem we’ve ever had. Wouldn’t you know it that the solution appears to be something that people are very intent on selling to you? Where have I heard that before? You wouldn’t know it looking at the landscape of IT right now but AI has iterated more times than you can count over the last couple of years. While people are still carrying on about LLMs and writing homework essays, the market has moved on to agentic solutions that act like employees doing things all over the place.

The result is people are more excited about the potential for AI than ever. Well, that is if you’re someone that has problems that need to be solved. If you’re someone doing something creative, like making art or music or poetry, you’re worried about what AI is going to do to your profession. That divide is what I’ve been thinking about for a while. I don’t think it should come as a shock to anyone, but I’ve figured out why AI is hot for every executive out there.

AI appeals to people that have someone doing work for them.

The Creative Process

I like writing. I enjoy coming up with fun synonyms and turns of phrase and understanding a topic while I create something around it. Sure, the process of typing the words out gets tedious. Finding the time to do it even more so, especially this year. I wouldn’t trade writing for anything because it helps me express thoughts in a way that I couldn’t before.

I know that I love writing because whenever I try to teach an AI agent to write like me I find the process painful. The instruction list is three pages long. You feed the algorithm a bunch of your posts and tell it to come up with an outline of how you write. What comes out the other side sounds approximately like you but misses a lot of the points. I think my favorite one was when I had an AI analyze one of my posts and it said I did a good job but needed to leave off my Tom’s Take at the end. When I went back to create an outline for training an AI to write like me the outline included leaving a summary at the end. Who knew?

People love the creative process. Whether it’s painting or woodworking or making music, creative people want to feel like they’ve accomplished something. They want to see the process unfold. The magic happens on the journey from beginning to end. Feel free to insert your favorite cliche about the journey here. A thing worth doing is worth taking your time to do.

Domo Arigato, Majordomo

You know who doesn’t love that process? Results-oriented people. You know the ones. The people that care more about the report being on time than the content. The people that need an executive summary at the beginning because they can’t be bothered to read the whole thing. The kind of people that flew the Concorde back in the day because they needed to be in New York with a minimum of delay. You’re probably already picturing these people in your head with suits and wide tie knots and a need to ensure the board sees things their way.

Executives, managers, and the like love AI. Because it replicates their workflow perfectly. They don’t create. They have others create. They don’t want to type or write or draw. They want to see the results and leverage them for other things. The report is there if you want to read it but they just need the summary so they can figure out what to do with it. Does it matter whether they’re asking a knowledge worker or an AI agent to create something?

The other characteristic of those people, especially as you go up the organizational chart, is their inability to discern bad information. They work from the assumption that everything presented in the report is accurate. The people that were doing it for them before were almost always accurate. Why wouldn’t the fancy new software be just as accurate? Of course, if the knowledge worker gave bad data to the executive they could be fired or disciplined for it. If the AI lies to the CEO what are they going to do? Put it in time out? The LLM or agent doesn’t even know what time out is.

People that have other people do things for them love AI. They want the rest of us to embrace it too because then we all have things doing work for us and that means they can realign their companies for maximum profit and productivity. The reliance on these systems creates opportunities for problems. I used the term majordomo in the title for a good reason. The kinds of people that have a majordomo (or butler) are exactly the kinds of people that salivate over AI. It’s always available, never wants to be complimented or paid, and probably gives the right information most of the time. Even if it doesn’t, who is going to know? Just ask another AI if it’s true.


Tom’s Take

The dependence on these systems means that we’re forgetting how to be creative. We don’t know how to build because something is building for us. Who is going to come up with the next novel file open command in Python or creative metaphor if we just rely on LLMs to do it for us now? We need to break away from the idea that someone needs to do things for us and embrace the idea of doing them. We learn the process better. We have better knowledge. And the more of them we do the more we realize what actually needs to be done. The background noise of AI agents doing meaningless tasks doesn’t make them go away. They just get taken care of by the artificial majordomos.

Don’t Let AI Make You Circuit City

I have a little confession. Sometimes I like to go into Best Buy and just listen. I pretend to be shopping for modem bearings or a left-handed torque wrench. What I’m really doing is hearing how people sell computers. I remember when 8x CD burners were all the rage. I recall picking one particular machine because it had an integrated Sound Blaster card. Today, I just marvel at how the associates rattle off a long string of impressive-sounding nonsense that consumers will either buy hook, line, and sinker or refute based on some YouTube reviewer’s recommendation. Every once in a while, though, I hear someone that actually does understand the lingo and it is wonderful. They listen, understand the challenges, and don’t sell a $3,000 gaming computer to a grandmother who just wants to play Candy Crush and look up grandkid photos on Facebook.

The Experience Matters

What does that story have to do with the title of this post? Well, dear young readers, you may not remember the time when Best Buy Blue was locked in mortal competition with Circuit City Red. In a time before Amazon was ascendant you had to pick between the two giants of big box tech retail. You may remember that Circuit City went out of business in 2009 thanks to the economic conditions of the time, but the real downfall of the company happened years earlier.

One of the things that set Circuit City apart from everyone else was their sales staff. They earned commissions based on helping customers. That meant they had to know their stuff to keep making money. And the very best of them could make a LOT of money. It also contributed a lot to the performance of the stores. The very best of the best were making a dent in the stores’ profit margins. What should management do about that?

If you guessed something sane and positive, you’d be wrong. In 2003, they eliminated their commissioned sales staff and fired nearly 4,000 of the best. You can just imagine what happened next. Sales plummeted. The associates left behind weren’t the top performers. They struggled to hit the revenue targets. Management panicked. They tried to rehire the overachievers they had let go at entry-level hourly rates. There was raucous laughter and lots of middle fingers. And five years later Circuit City collapsed into fodder for YouTube historians to analyze.

What doomed Circuit City was not an economic bubble popping. It wasn’t Amazon or the rise of independent influencers. It wasn’t cheap parts or dupes of the best camcorders and tape recorders. It was the hubris to think that the people that had spent their careers learning the ins and outs of technology were replaceable with less skill and less cost to the business. Inexperience may sound impressive to those that don’t understand, but the knowledgeable customer knows the difference. The Circuit City execs learned that lesson the hard way. But our old friend George Santayana has a new generation to teach.

Repeating The Past

How could you possibly decide to fire your best performers and replace them with something cheap that spouts out approximate answers that look correct but are ultimately useless when applied in reality? What kind of CEO would think about that just to shave some numbers off the bottom line in the name of Shareholder Value?

Oh. Yeah.

AI.

LLMs are making advances by leaps and bounds compared to just eighteen months ago. But they are not a replacement for people that understand the actual technology. LLMs don’t learn the way that people learn. They are trained and refined to find better solutions to problems but they don’t “learn”. They just get slightly better over time about not putting adverbs in every sentence they write. People make math mistakes that blow up switches and routers. LLMs eventually learn that the word “double” doesn’t always mean double.

To an executive, LLMs sound impressive. They’re filled with impressive words that mean a lot of nothing. To knowledge workers, LLMs create approximations of words that have no meaning. Nowhere is this more apparent than in the fact that we’ve created entire acronyms, like Retrieval Augmented Generation (RAG), to reduce the likelihood that an LLM will just make something up because it sounds good. If you need someone to do nothing more than make something up because it’s what an executive wants to hear, I’m way cheaper than any GPU cluster that NVIDIA is shipping today, including power consumption costs.

Shaving Dollars and Sense

Circuit City thought that customers wouldn’t know the difference between Bob the Sales Juggernaut’s expansive knowledge of DVD players and Billy the New Guy’s attempts to sound as impressive as Bob. The kinds of people that will pay top dollar for a plasma TV know the difference. The kinds of people that rely on Bob to tell them what to buy because they’re too busy to shop don’t. Circuit City thought they could save on the bottom line by removing experience and they removed the entire bottom line, along with everything above it too.

The rumblings are already there in the market. Entry-level tasks will be handled by AI so we can focus on higher-order thinking. The real value is going to be in the way that experts solve problems. At least until we figure out the next thing after LLMs, which can try to approximate the thinking of those experts. Then we start the cycle all over again. And those experts? Do you know where they got the knowledge? By doing the meaningless entry-level tasks until they mastered them. They learned as they worked. They tweaked their own algorithms without the need for fifty new GPUs.


Tom’s Take

Thinking you can replace experience with cheap substitutes leads to disaster every single time. “Good enough” isn’t good enough when people know enough about the subject to understand they’re hearing garbage. In fact, I’d argue that AI might be good enough to do the one thing that Circuit City didn’t figure out for years. If your executive team is so great at making poor decisions that they could be replaced by a soulless software program, maybe they should be replaced instead. You might still go out of business eventually but the reduced salaries at the top might keep the lights on a little longer. Who knows? Maybe AI could learn a thing or two that way.

AI Should Be Concise

One of the things that I’ve noticed about the rise of AI is that everything feels so wordy now. I’m sure it’s a byproduct of the popularity of ChatGPT and other LLMs that are designed for language. You’ve likely seen it too on websites that have paragraphs of text that feel unnecessary. Maybe you’re looking for an answer to a specific question. You could be trying to find a recipe or even a code block for a problem. What you find is a wall of text that feels pieced together by someone that doesn’t know how to write.

The Soul of Wit

I feel like the biggest issue with those overly word-filled answers comes down to the way that people feel about unnecessary exposition. AI is built to write things on a topic and fill out word count. Much like a student trying to pad out the page length for a required report, AI doesn’t know when to shut up. It specifically adds words that aren’t really required. I realize that there are modes of AI content creation that value being concise but those aren’t the default.

I use AI quite a bit to summarize long articles, many of which I’m sure were created with AI assistance in the first place. AI is quite adept at removing the unneeded pieces, likely because it knows where they were inserted in the first place. It took me a while to understand why this bothered me so much. What is it about having a computer spend way too much time explaining answers to you that feels wrong?

Enterprise D Bridge

Then it hit me. It felt wrong because we already have a perfect example of what an intelligence should feel like when it answers you. It comes courtesy of Gene Roddenberry and sounds just like his wife Majel Barrett-Roddenberry. You’ve probably guessed that it’s the Starfleet computer system found on board every Federation starship. If you’ve watched any series since Next Generation you’ve heard the voice of the ship computer executing commands and providing information to the crew members, guests, and even holographic projections.

Why is the Star Trek computer a better example of AI behavior to me? In part because it provides information in the most concise manner possible. When the captain asks a question the answer is produced. No paragraphs necessary. No use of delve or convolutional needed. It produces the requested info promptly. Could you imagine a ship’s computer that drones on for three paragraphs before telling the first officer that the energy pulse is deadly and the shields need to be raised?

Quality Over Quantity

I’m sure you already know someone that thinks they know a lot about a subject and is more than happy to tell you about what they know. Do they tend to answer questions or explain concepts tersely? Or do they add in filler words and try to talk around tricky pieces in order to seem like they have more knowledge than they actually do? Can you tell the difference? I’m willing to bet that you can.

That’s why GPT-style LLM content creation feels so soulless. We’re conditioned to appreciate precision. The longer someone goes on about something the more likely we are to either tune out or suspect it’s not an accurate answer. That’s actually a way that interrogators are trained to uncover falsehoods and lies. People stretching the truth are more likely to use more words in their statements.

There’s also more reasoning behind the padding. Think about how many ads are usually running on sites that have this kind of AI-generated content. Is it just a few? Or as many as possible, inserted between every possible paragraph? It’s not unlike video sites like YouTube having ads inserted at certain points in the video. If you can insert an additional ad into any video that hits a minimum of twenty minutes, how long do you think the average video is going to be for channels that rely on ad revenue? The actual substance of the content isn’t as important as getting those extra ad clicks.


Tom’s Take

It’s unlikely that my ramblings about ChatGPT are going to change things any time soon. I’d rather have the precision of Star Trek over the hollow content that spins yarns about family life before getting to the actual recipe. Maybe I’m in the minority. But I feel like my audience would prefer getting the results they want and doing away with the unnecessary pieces. Could this blog post have been a lot shorter and just said “Stop being so wordy”? Sure. But it’s long because it was written by a human.

The Legacy of Cisco Live

Legacy: Something transmitted by or received from an ancestor or predecessor or from the past. — Merriam-Webster

Cisco Live 2024 is in the books. I could recap all the announcements but that would take forever. You can find an AI that can summarize them for you much faster. That’s because AI was the largest aspect of what was discussed. Love it or hate it, AI has taken over the IT industry for the time being. More importantly it has also focused companies on the need to integrate AI functions into their product lines to avoid being left behind by upstarts.

That’s what you see in the headlines. Something I noticed while I was there was how the march of time has affected us all. After eighteen years I finally realized the sessions today have less in common with the ones I was attending back in 2010 than ever before. Development and advanced feature configuration have replaced the tuning of routing protocols and CallManager deployment tips. It’s a game for younger engineers that have less to unlearn from the legacy technologies I’ve spent my career working on.

Leaving a Legacy

But legacy is a word with more than one definition. It’s easy to think of legacy as old technology or technical debt. But it can also be something you leave to the next generation, as the definition at the top of this post says. What we leave behind for those we teach and lead is as important as any system out there. Because those lessons persist long after the technology has fallen away.

For the first time that I could remember, my friends were bringing their kids to the show. Not to enjoy a vacation or to hang out by the pool. They were coming because it was time for them to step forward and learn and make connections in the industry. Folks like Jody Lemoine, Rita Younger, Martin Duggan, and Brandon Carroll shared the passion and excitement of Cisco Live with their older children as a way to help them grow.

We’re not done with our careers yet but we are at the point where it’s time to show those behind us the path. It is no longer a race to consume knowledge as quickly as possible and put it into use. It’s about helping people by leveraging our legacy to teach them and help them along the way. Our group welcomed the kids with open arms. We talked to them, shared our perspectives, and made them feel welcome. We showed them the same courtesy that was shown to us years before.

Inspiring Others

The legacy of Cisco Live is more than just teaching the next generation. It’s seeing the way that the conference has transformed. I will admit that my activity on social media is a pale imitation of what it used to be. The face of Cisco Live is now influencers like Lexie Cooper and Alexis Bertholf, who have embraced new platforms and found their voice to share content with others in a way that is comfortable for them to consume. The number of people that want to read a long blog post is waning. Concepts are communicated in short bursts. That’s where the next generation excels.

Seeing people running across the show floor to meet new creators like Kevin Nanns reminded me of a time when I was doing the same thing. I wanted to know everyone that I could to learn as much as possible. Now I get to see others doing the same and smile. New faces are meeting their heroes and building their communities. The process continues no matter the platform. People find their voice and share with others. Whether it’s a podcast or TikTok or a casual conversation over lunch. It’s about making those connections and keeping them going.


Tom’s Take

That’s where I started. That’s why I do it. To meet new people and help them build a community. I have my community of wonderful Cisco Live people. I have Tech Field Day. I have The Corner, which is my most lasting Cisco Live legacy. I’m excited to see so many people passing their legacy along to the next generation. I love seeing new faces in the creator space popping up to share their stories and their journeys. Cisco Live will be in San Diego in 2025 and I can’t wait to see who shows up and what legacy they’ll leave.

Butchering AI

I once heard a quote that said, “The hardest part of being a butcher is knowing where to cut.” If you’ve ever eaten a cut of meat you know that the difference between a tender steak and a piece of meat that needs hours of tenderizing is just inches apart. Butchers train for years to be able to make the right cuts in the right pieces of meat with speed and precision. There’s even an excellent Medium article about the dying art of butchering.

One thing that struck me in that article is how the art of butchering relates to AI. Yes, I know it’s a bit corny and not an easy segue into a technical topic but that transition is about as subtle as the way AI has come crashing through the door to take over every facet of our lives. It used to be that AI was some sci-fi term we used to describe intelligence emerging in computer systems. Now, AI is optimizing my PC searches and helping with image editing and creation. It’s easy, right?

Except some of those things that AI promises to excel at doing are things that professionals have spent years honing their skills at performing. Take this article announcing the release of the Microsoft Copilot+ PC. One of the things they are touting as a feature is using neural processing units (NPUs) to allow applications to automatically remove the background from an image in a video clip editor. Sounds cool, right? Have you ever tried to use an image editor to remove or blur the background of an image? I did a few weeks ago and it was a maddening experience. I looked for a number of how-to guides and none of them had good info. In fact, most of the searches just led me to apps that claimed to use some form of AI to remove the background for me. Which isn’t what I wanted.

Practice Makes Perfect

Bruce Lee said, “I fear not the man who has practiced 10,000 kicks once, but I fear the man who has practiced one kick 10,000 times.” His point was that practice of a single thing is what makes a professional stand apart. I may know a lot about history, for example, but I’ll never be as knowledgeable about Byzantine history as someone who has spent their whole career studying it. Humans develop skills via repetition and learning. It’s how our brains are wired. We pick out patterns and we reinforce them.

AI attempts to simulate this pattern recognition and operationalize it. However, the learning process that we have simulated isn’t perfect. AI can “forget” how to do things. Sometimes this is built into the system with something like unstructured learning. Other times it’s a failure of the system inputs, such as a corrupted database or connectivity issue. Either way the algorithm defaults back to a state of being a clean slate with no idea how to proceed. Even on their worst days a butcher or a plumber never forgets how to do their job, right?

The other maddening thing is that the AI peddlers try to convince everyone that teaching their software means we never have to learn ever again. After all, the algorithm has learned everything and can do it better than a human, right? That’s true, as long as the conditions don’t change appreciably. It reminds me of signature-based virus detection from years ago. As long as the infection matched the definition you could detect it. As soon as the code changed and became polymorphic it was undetectable. That led to the rise of heuristic-based detections and eventually to the state of endpoint detection and response (EDR) we have today.
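To make that difference concrete, here’s a minimal sketch in Python, with entirely made-up signatures, behaviors, and weights, of why the signature approach breaks the moment the bytes change while a heuristic approach keys off what the code actually does:

    # Toy illustration (not real AV code): signature matching vs. heuristic scoring.
    # The hash list, behavior names, and weights are invented for the example.
    import hashlib

    KNOWN_SIGNATURES = {"5d41402abc4b2a76b9719d911017c592"}  # hypothetical signature database

    def signature_match(sample: bytes) -> bool:
        """Flag a sample only if its hash exactly matches a known signature."""
        return hashlib.md5(sample).hexdigest() in KNOWN_SIGNATURES

    SUSPICIOUS_BEHAVIORS = {"writes_to_system_dir": 3, "disables_av": 5, "encrypts_files": 4}

    def heuristic_score(observed: set, threshold: int = 6) -> bool:
        """Flag a sample if its observed behaviors add up past a threshold,
        regardless of what the bytes themselves look like."""
        return sum(SUSPICIOUS_BEHAVIORS.get(b, 0) for b in observed) >= threshold

    # A polymorphic variant changes its bytes (new hash, signature miss)
    # but still has to behave badly to do its job (heuristic hit).
    variant = b"same payload, slightly mutated bytes"
    print(signature_match(variant))                            # False
    print(heuristic_score({"disables_av", "encrypts_files"}))  # True

The point of the toy example is the same one the industry eventually landed on: match behavior, not bytes, because bytes are cheap to mutate.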

That’s a long way to say that the value in training someone to do a job isn’t in them gaining just the knowledge. It’s about training them to apply that knowledge in new situations and extrapolate from incomplete data. In the above article about the art of butchering, the author mentions that he was trained on a variety of animals and knows where the best cuts are for each. That took time and effort and practice. Today’s industrialized butcher operations train each person to make a specific cut. So the person cutting a ribeye steak doesn’t know how to make the cuts for ribs or cube steaks. They would need to be trained on that input in order to do the task. Not unlike modern AI.


Tom’s Take

You don’t pay a butcher for a steak. You pay them for knowing how to cut the best one. AI isn’t going to remove the need for professionals. It’s going to make some menial tasks easier to do but when faced with new challenges or the need to apply skills in an oblique way we’re still going to need to call on humans trained to think outside the box to do it without hours and days of running simulations. The human brain is still unparalleled in its ability to adapt to new stimuli and apply old lessons appropriately. Maybe you can train an AI to identify the best parts of the cow but I’ll take the butcher’s word for it.

Copilot Not Autopilot

I’ve noticed a trend recently with a lot of AI-related features being added to software. They’re being branded as “copilot” solutions. Yes, Microsoft Copilot was the first to use the name and the rest are just trying to jump in on the brand recognition, much like using “GPT” last year. The word “copilot” is so generic that it’s unlikely to be trademarked without adding more, like the company name or some other unique term. That made me wonder if the goal of using that term was simply to cash in on brand recognition or if there was more to it.

No Hands

Did you know that an airplane can land entirely unassisted? It’s true. It’s a feature commonly called Auto Land and it does exactly what it says. It uses the airport’s Instrument Landing System (ILS) to land automatically. Pilots rarely use it because of a variety of factors, including the need for tiny last-minute adjustments during a very stressful part of the flight as well as the equipment requirements, such as a fairly modern ILS system. That doesn’t even mention that use of Auto Land snarls airport traffic because of the need to hold other planes outside ILS range to ensure only one plane can use it.

The whole thing reminds me of when autopilot is used on most flights. Pilots usually take the controls during takeoff and landing, which are the two most critical phases of flight. For the rest, autopilot is used a lot of the time. Those are the boring sections where you’re just flying a straight line between waypoints on your flight plan. That’s something that automated controls excel at doing. Pilots can monitor but don’t need to have their full attention on the readings every second of the flight.

Pilots will tell you that taking the controls for the approach and landing is just smart for many reasons, chief among them that it’s something they’re trained to do. More importantly, it places the overall control of the landing in the hands of someone that can think creatively and isn’t just relying on a script and some instrument readings to land. Yes, that is what ILS was designed to do but someone should always be there to ensure that what’s been sent is what should be followed.

Pilot to Copilot

As you can guess, the parallels between this process and using AI in your organization are a good match. AI may have great suggestions and may even come up with more novel ways of making you more productive but it’s not the only solution to your problems. I think the copilot metaphor is perfectly illustrated with the rush to have GPT chatbots write reports and articles last year.

People don’t like writing. At least, that’s the feeling that I got when I saw how many people were feeding prompts to OpenAI and having it do the heavy lifting. Not every output was good. Some of it was pretty terrible. Some of it was riddled with errors. And even the things that looked great still had that aura of something like the uncanny valley of writing. Almost right but somehow wrong.

Part of the reason for that was the way that people just assumed that the AI output was better than anything they could have come up with and did no further editing to the copy. I barely trust my own skills to publish something with minimal editing. Why would I trust a know-it-all computer algorithm? Especially with something that has technical content? Blindly accepting an LLM’s attempt at content creation is just as crazy as assuming that there’s no need to double-check math calculations if the result is outside of your expectations.

Copilot works for this analogy because copilots are there to help and to be a check against error. The old adage of “trust but verify” is absolutely the way they operate. No pilot would assume they were infallible and no copilot would assume everything the pilot said was right. Human intervention is still necessary in order to make sure that the output matches the desired result. The biggest difference today is that when it comes to AI art generation or content creation, a failure to produce a desired result means wasted time. When an autopilot makes a mistake landing an airliner, the results are far more horrific.

People want to embrace AI to take away the drudgery of their jobs. It’s remarkably similar to how automation was going to take away our jobs before we realized it was really going to take away the boring, repetitive parts of what we do. Branding AI as “autopilot” will have negative consequences for adoption because people don’t like the idea of a computer or an algorithm doing everything for them. However, copilots are helpful and can take care of boring or menial tasks leaving you free to concentrate on the critical parts of your job. It’s not going to replace us as much as help us.


Tom’s Take

Terminology matters. Autopilot is cold and restrictive. Having a copilot sounds like an adventure. Companies are wise not to encourage the assumption that AI is going to take over jobs and eliminate workers. The key is that people should see the solution as offering a way to offload tasks and ask for help when needed. It’s a better outcome for the people doing the job as well as the algorithms that are learning along the way.

Human Generated Questions About AI Assistants

I’ve taken a number of briefings in the last few months that all mention how companies are starting to get into AI by building an AI virtual assistant. In theory this is the easiest entry point into the technology. Your network already has a ton of information about usage patterns and trouble spots. Network operations and engineering teams have learned over the years to read that information and provide analysis and feedback.

If marketing is to be believed, no one in the modern world has time to learn how to read all that data. Instead, AI provides a natural language way to ask simple questions and have the system provide the data back to you with proper context. It will highlight areas of concern and help you grasp what’s going on. Only you don’t need to get a CCNA to get there. Or, more likely, it’s more useful for someone on the executive team to ask questions and get answers without the need to talk to the network team.

I have some questions that I always like to ask when companies start telling me about their new AI assistant. They help me understand how it’s being built.

Question 1: Laying Out LLMs

My first question is always:

Which LLM are you using to power your system?

The reason is that there are only two real options. You’re either paying someone else to do it as a service, like OpenAI, or you’re pulling down your own large language model (LLM) and building your own system. Both have advantages and disadvantages.

The advantage of a service-based offering is that you don’t need to program anything. You just feed the data to the LLM and it takes off. No tuning needed. It’s fast and universally available.

The downside of a service-based model is the fact that it costs money. And if you’re using it commercially it’s going to cost more than a simple monthly fee. The more you use it, the more expensive it gets. If your vendor is pulling thousands of daily requests from the LLM, is that factored into the fee they’re charging you? What happens when the OpenAI prices go up?

The advantages of building your own system are that you have complete control over the way the data is being processed. You tune the LLM and you own the way it’s being used. No need to pay more to someone else to do all the work for you. You can also decide how and when features are implemented. If you’re updating the LLM on your schedule you can include new features when they’re ready and not when OpenAI pushes them live and makes them available for everyone.

The disadvantages of building your own system involve maintenance. You have to update and patch it. You have to figure out what features to develop. You have to put in the work. And if the model you use goes out of support or is no longer being maintained you have to swap to something new and hope that all your functions are going to work with the new one.
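For what it’s worth, the two paths look very different in code as well. Here’s a rough sketch in Python of both options; the endpoint, model names, and parameters are illustrative assumptions rather than a recommendation, so check your vendor’s current API documentation and pricing before building on either one.

    # Sketch only: contrast a pay-per-request hosted LLM with a self-hosted model.
    import os
    import requests

    def ask_hosted_llm(question: str) -> str:
        """Option 1: pay a provider per request (OpenAI-style chat endpoint)."""
        resp = requests.post(
            "https://api.openai.com/v1/chat/completions",
            headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
            json={
                "model": "gpt-4o-mini",  # assumed model name; substitute your own
                "messages": [{"role": "user", "content": question}],
            },
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()["choices"][0]["message"]["content"]

    def ask_local_llm(question: str) -> str:
        """Option 2: pull down and run your own model (Hugging Face transformers)."""
        from transformers import pipeline
        generator = pipeline("text-generation", model="gpt2")  # stand-in model
        return generator(question, max_new_tokens=100)[0]["generated_text"]

The hosted path is a dozen lines and a credit card; the self-hosted path is the same dozen lines plus the GPUs, patching, and model lifecycle work described above.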

Question 2: Data Sources

My second question:

Where does the LLM data come from?

It may seem simple at first, right? You’re training your LLM on your data so it gives you answers based on your environment. You’d want that to be the case so it’s more likely to tell you things about your network. But that insight doesn’t come out of thin air. If you want to feed your data to the LLM to get answers you’re going to have to wait while it studies the network and comes up with conclusions.

I often ask companies if they’re populating the system with anonymized data from other companies to provide baselines. I’ve seen this before from companies like Nyansa, which was bought by VMware, and Raza Networks, which is now part of HPE Aruba. Both of those companies, which came out long before the current AI craze, collected data from customers and used it to build baselines for everyone. If you wanted to see how you compared to other higher education or medical verticals the system could tell you what those types of environments looked like, with the names obscured of course.

Pre-populating the LLM with information from other companies is great if your stakeholders want to know how they fare against other companies. But it also runs the risk of populating data that shouldn’t be in the system. That could create situations where you’re acting on bad information or chasing phantoms in the organization. Worse yet, your own data could be used in ways you didn’t intend to feed other organizations. Even with the names obscured someone might be able to engineer a way to obtain knowledge about your environment you don’t want everyone to have.
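As a rough illustration of what that pooling might look like under the hood, here’s a short Python sketch of anonymizing telemetry before rolling it into a shared vertical baseline. The field names and metric are hypothetical, and as noted above, hashing names is not a guarantee against re-identification.

    # Sketch only: strip direct identifiers before pooling customer telemetry.
    import hashlib
    from statistics import median

    def anonymize(record: dict, salt: str) -> dict:
        """Replace the customer name with a salted hash and keep only the metrics."""
        return {
            "customer_id": hashlib.sha256((salt + record["customer_name"]).encode()).hexdigest()[:12],
            "vertical": record["vertical"],          # e.g. "higher education", "medical"
            "wifi_retry_rate": record["wifi_retry_rate"],
        }

    def vertical_baseline(records: list, vertical: str) -> float:
        """Median metric across anonymized records from one vertical."""
        values = [r["wifi_retry_rate"] for r in records if r["vertical"] == vertical]
        return median(values)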

Question 3: Are You Seeing That?

My third question:

How do you handle hallucinations?

Hallucination is the term for when the AI comes up with an answer that is false. That’s right, the super intelligent system just made up an answer instead of saying “I don’t know”. Which is great if you’re trying to convince someone you’re smart or useful. But if the entire reason why I’m using your service is accurate answers about my problems I’d rather have you say you don’t have an answer or you need to do research instead of giving me bad data that I use to make bad decisions.

If a company tells me they don’t really see hallucinations then I immediately get concerned, especially if they’re leveraging OpenAI for their LLM. I’ve talked before about how ChatGPT has a really bad habit of making up answers so it always looks like it knows everything. That’s great if you’re trying to get the system to write a term paper for you. It’s really bad if you try to reroute traffic in your network around a non-existent problem. I know there are many systems out there that can help reduce hallucinations, such as retrieval augmented generation (RAG), but I need that to be addressed up front instead of a simple “we don’t see hallucinations” because that makes me feel like something is being hidden or glossed over.
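When a vendor does address it, the answer usually looks something like the sketch below: retrieve relevant context from your own data, constrain the model to it, and refuse to answer when nothing relevant comes back. This is a deliberately naive Python version that uses keyword overlap in place of a real vector store, just to show the shape of the approach.

    # Bare-bones RAG sketch: ground the prompt in local documents, refuse otherwise.
    NETWORK_DOCS = [
        "Core switch sw-core-01 has redundant uplinks to both distribution switches.",
        "OSPF area 0 covers the data center; branch sites run area 10.",
    ]

    def retrieve(question: str, docs: list, min_overlap: int = 2) -> list:
        """Rank documents by naive keyword overlap with the question."""
        q_words = set(question.lower().split())
        scored = [(len(q_words & set(d.lower().split())), d) for d in docs]
        return [d for score, d in sorted(scored, reverse=True) if score >= min_overlap]

    def answer(question: str, llm_call) -> str:
        """Only send the model a prompt when we have supporting context."""
        context = retrieve(question, NETWORK_DOCS)
        if not context:
            return "I don't have data on that."  # prefer refusal over invention
        prompt = "Answer ONLY from the context below.\n" + "\n".join(context) + f"\nQ: {question}"
        return llm_call(prompt)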


Tom’s Take

These aren’t the only questions you should be asking about AI and LLMs in your network but they’re not a bad start. They encompass the first big issues that people are likely to run into when evaluating an AI system. How do you do your analysis? What is happening with my data? What happens when the system doesn’t know what to do? Sure, there’s always going to be questions about cost and lock-in but I’d rather know the technology is sound before I ever try to deploy the system. You can always negotiate cost. You can’t negotiate with a flawed AI.

AI Is Making Data Cost Too Much

You may recall that I wrote a piece almost six years ago comparing big data to nuclear power. Part of the purpose of that piece was to knock the wind out of the “data is oil” comparisons that were so popular. Today’s landscape is totally different now thanks to the shifts that the IT industry has undergone in the past few years. I now believe that AI is going to cause a massive amount of wealth transfer away from the AI companies and cause startup economics to shift.

Can AI Really Work for Enterprises?

In this episode of Packet Pushers, Greg Ferro and Brad Casemore debate a lot of topics around the future of networking. One of the things that Brad brought up, which Greg also pointed out, is that the data being used for AI algorithm training is being stored in the cloud. That massive amount of data is sitting there waiting to be used between training runs and it’s costing some AI startups a fortune in cloud costs.

AI algorithms need to be trained to be useful. When you use ChatGPT to write a term paper or ask nonsensical questions you’re using the output of the GPT training run. The real work happens when OpenAI is crunching data and feeding their monster. They have to give it a set of parameters and data to analyze in order to come up with the magic that you see in the prompt window. That data doesn’t just come out of nowhere. It has to be compiled and analyzed.

There are a lot of content creators that are angry that their words are being fed into the GPT algorithm runs and then being used in the results without giving proper credit. That means that OpenAI is scraping the content from the web and feeding it into the algorithm without care for what they’re looking at. It also creates issues where the validity and the accuracy of the data aren’t verified ahead of time.

Now, this focuses on OpenAI and GPT specifically because everyone seems to think that’s AI right now. Much like every solution in the history of IT, GPT-based large language models (LLMs) are just a stepping stone along the way to greater understanding of what AI can do. The real value for organizations, as Greg pointed out in the podcast, can be something as simple as analyzing the trouble ticket a user has submitted and then offering directed questions to help clarify the ticket for the help desk so they spend less time chasing false leads.

No Free Lunchboxes

Where are organizations going to store that data? In the old days it was going to be collected in on-prem storage arrays that weren’t being used for anything else. The opportunity cost of using something you already owned was minimal. After all, you bought the capacity so why not use it? Organizations that took this approach decided to just save every data point they could find in an effort to “mine” for insights later. Hence the references to oil and other natural resources.

Today’s world is different. LLMs need massive resources to run. Unless you’re willing to drop several million dollars to build out your own cluster resources and hire engineers to keep them running at peak performance you’re probably going to be using a hosted cloud solution. That’s easy enough to set up and run. And you’re only paying for what you use. CPU and GPU time is important so you want the job to complete as fast as possible in order to keep your costs low.

What about the data that you need to feed to the algorithm? Are you going to feed it from your on-prem storage? That’s way too slow, even with super fast WAN links. You need to get the data as close to the processors as possible. That means you need to migrate it into the cloud. You need to keep it there while the magic AI building machine does the work. Are you going to keep that valuable data in the cloud, incurring costs every hour it’s stored there? Or are you going to pay to have it moved back to your enterprise? Either way the sound of a cash register is deafening to your finance department and music to the ears of cloud providers and storage vendors selling them exabytes of data storage.
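Some quick back-of-the-envelope math shows why the finance department hears that cash register. The dataset size and per-GB rates below are assumptions for illustration only; substitute your provider’s actual storage and egress pricing.

    # Illustrative cost math with assumed rates; plug in real pricing before deciding.
    DATASET_TB = 500
    STORAGE_PER_GB_MONTH = 0.023   # assumed object-storage rate, $/GB-month
    EGRESS_PER_GB = 0.09           # assumed data-transfer-out rate, $/GB

    dataset_gb = DATASET_TB * 1024
    monthly_storage = dataset_gb * STORAGE_PER_GB_MONTH
    one_time_egress = dataset_gb * EGRESS_PER_GB

    print(f"Keep it in the cloud: ~${monthly_storage:,.0f} per month, every month")
    print(f"Pull it back on-prem: ~${one_time_egress:,.0f} in egress, once")

Neither number is catastrophic on its own. The problem is that training isn’t a one-time event, so you keep paying one of those bills over and over.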

All those hopes of making tons of money from your AI insights are going to evaporate in a pile of cloud bills. The operations costs of keeping that data are now more than minimal. If you want to have good data to operate on you’re going to need to keep it. And if you can’t keep it locally in your organization you’re going to have to pay someone to keep it for you. That means writing big checks to the cloud providers that have effectively infinite storage, bounded only by the limit on your credit card or purchase order. That kind of wealth transfer makes investors a bit hesitant when they aren’t going to get the casino-like payouts they’d been hoping for.

The shift will cause AI startups to be very frugal in what they keep. They will either amass data only when they think their algorithm is ready for a run or keep only critical data that they know they’re going to need to feed the monster. That means they’re going to be playing a game with the accuracy of the resulting software as well as giving up the chance that some insignificant piece of data ends up being the key to a huge shift. In essence, the software will all start looking and sounding the same after a while and there won’t be enough differentiation to make them competitive because no one will be able to afford it.


Tom’s Take

The relative ease with which data could be stored turned companies into data hoarders. They kept it forever hoping they could get some value out of it and create a return curve that soared to the moon. Instead, the application for that data mining finally came along and everyone realized that getting the value out of the data meant investing even more capital into refining it. That kind of investment makes those curves much flatter and makes investors more reluctant. That kind of shift means more work and less astronomical payout. All because your resources were more costly than you first thought.

Using AI for Attack Attribution

While I was hanging out at Cisco Live last week, I had a fun conversation with someone about the use of AI in security. We’ve seen a lot of companies jump in to add AI-enabled services to their platforms and offerings. I’m not going to spend time debating the merits of it or trying to argue for AI versus machine learning (ML). What I do want to talk about is something that I feel might be a little overlooked when it comes to using AI in security research.

Whodunnit?

After a big breach notification or a report that something has been exposed there are two separate races that start. The most visible is the one to patch the exploit and contain the damage. Figure out what’s broken and fix it so there’s no more threat of attack. The other race involves figuring out who is responsible for causing the issue.

Attribution is something that security researchers value highly in the post-mortem of an attack. If the attack is the first of its kind the researchers want to know who caused it. They want to see if the attackers are someone new on the scene that have developed new tools and skills or if it is an existing person or group that has expanded their target list or repertoire. If you think of a more traditional definition of crime from legal dramas and police procedurals you are wondering if this is a one-off crime or if this is a group expanding their reach.

Attribution requires analysis. You need to look for the digital fingerprints of a group in the attack patterns. Did they favor a particular entry point? Are they looking for the same kinds of accounts to do privilege escalation? Did they deface the web servers with the same digital graffiti? For attackers looking to make a name for themselves, attribution is pretty easy to figure out. They want to make a splash. However, for state-sponsored crews or organizations looking to keep a low profile it is much more likely they’re going to obfuscate their methods to avoid detection as long as possible. They might even throw out a few red herrings to make people attribute the attack to a different group.

Picking Out Patterns

If the methodology of doing attribution requires pattern matching and research, why not use AI to assist? We already use AI and ML to help us detect the breaches. Why not apply it to figuring out who is doing the breaching? We already know that AI can help us identify people based on a variety of characteristics. Just look up any kind of market research done by advertising agencies and you can see how scarily well they can predict buyer behavior based on all kinds of pattern recognition.

Let’s apply that same methodology to attack attribution. AI and ML are great at not only sifting through the noise when it comes to pattern recognition but they can also build a profile of the patterns to confirm those suspicions. Imagine profiling an attacker by seeing that they use one or two methods for gaining entry, such as spearphishing, to gain access to start privilege escalation. They always go after the same service accounts and move laterally to the same servers after gaining it. This is all great information for predicting attacks and stopping them. But it’s super valuable for tracking down who is doing it.
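A toy version of that profiling might look like the Python below: compare the techniques observed in an incident against stored profiles of known groups and rank the overlap. The group names and technique labels are invented for the example; real tooling would lean on something like MITRE ATT&CK technique IDs and far richer features.

    # Sketch only: rank known attacker profiles by overlap with an incident's techniques.
    KNOWN_GROUPS = {
        "GroupA": {"spearphishing", "service_account_abuse", "lateral_smb"},
        "GroupB": {"watering_hole", "web_shell", "defacement"},
    }

    def jaccard(a: set, b: set) -> float:
        """Similarity between two technique sets (intersection over union)."""
        return len(a & b) / len(a | b) if a | b else 0.0

    def attribute(observed: set) -> list:
        """Return known groups sorted by how closely their profile matches."""
        scores = {name: jaccard(observed, ttps) for name, ttps in KNOWN_GROUPS.items()}
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

    incident = {"spearphishing", "service_account_abuse", "new_exfil_channel"}
    print(attribute(incident))  # GroupA scores highest, but a human still makes the call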

Assuming that crews bring new attackers on board frequently to keep their crime pipeline full you can also see how much of the attack profile is innate talent versus training. One could assume that these organizations aren’t terribly different from your average IT shop when it comes to training. It’s just the result of that training that differs. If you start seeing a large influx of attacks that use repetition of similar techniques from different locations it could be assumed that there is some kind of training going on somewhere in the loop.

The other thing that provides value is determining when someone is trying to masquerade as a different group using techniques to obfuscate or misattribute breaches. Building a profile of an attacker means you know how long it takes them to move to new targets or how likely they are to take certain actions within a specific window. If you work out the details of an attack you can see quickly if someone is following a script or if they’re doing something in a specific way to make it look like someone else is trying to get in. This especially applies at the level of nation-state sponsored groups, since creating doubt in the attribution can prevent your detection or even cause diplomatic sanctions against the wrong country.

Of course, the real challenge is that AI and ML aren’t foolproof. They aren’t the ultimate arbiter of attack recognition and attribution. Instead, they are tools that should be introduced into the kit to help speed identification and provide assurances that you’ve got the right group before you publicize what you’ve found.


Tom’s Take

There’s a good chance that some security companies out there are already looking at or using AI to do attribution. I think it’s important to broaden our toolkits and use of models in all areas of cybersecurity. It also provides a baseline for creating normalized investigation. There have been too many cases where a researcher has rushed to pin attribution on a given group only to find out it wasn’t them at all. Using tools to confirm your suspicions not only reduces the likelihood you will name the wrong attacker but it also reduces the need to publicize quickly to claim credit for the identification. This should be about protection, not publicity.

ChatGPT and Creating For Yourself

I’m sure you’ve been inundated by posts about ChatGPT over the past couple of weeks. If you managed to avoid it the short version is that there is a new model from OpenAI that can write articles, create poetry, and basically answer your homework. Lots of people are testing it out for things as mundane as writing Amazon reviews or creating configurations for routers.

It’s not a universal hit though. Stack Overflow banned ChatGPT code answers because they’re almost always wrong. My own limited tests show that it can create a lot of words from a prompt that seem to sound correct but feel hollow. Many others have accused the algorithm of scraping content from others on the Internet and sampling it into answers to make it sound accurate but not the best answer to the question.

Are we ready for AI to do our writing for us? Is the era of the novelist or technical writer finished? Should we just hang up our keyboards and call it a day?

Byte-Sized Content

When I was deciding what I wanted to do with my life after college I took the GMAT to see if I could get into grad school for an MBA. I scored well on the exam but not quite to the magical level to get a scholarship. However, one area where I did do surprisingly well was the essay writing section. I bought a prep book that had advice for the major sections but spent a lot of time with the writing portion because it was relatively new at the time and many people were having issues with how to write an essay. The real secret is that the essay was graded by a computer, so you just had to follow a formula to succeed:

  1. Write an opening paragraph covering what you’re going to say with three points of discussion.
  2. Write a paragraph about point 1 and provide details to support it.
  3. Repeat for points 2 & 3.
  4. Write a summary paragraph restating what you said in the opening.

That’s it. That’s the formula to win the GMAT writing portion. The computer isn’t looking for insightful poetry or groundbreaking sci-fi world building. It’s been trained to look for structure. Main idea statements, supporting evidence, and conclusions all tick boxes that provide points to pass the section.

If all that sounds terribly boring and formulaic you’re absolutely right. Passing a test of competence isn’t the same as pushing the boundaries of the craft. A poet like e e cummings would have failed because his work has no structure and, by the standards of grammar, is full of capitalization errors. Yet no one would deny that he is a master of his craft. Likewise, always following the standards is only important when you want to create things that already exist.

Free Thinking

Tech writing is structured but often involves new ideas that aren’t commonplace. How can you train an algorithm to write about Zero Trust Network Architecture or VR surgery if no examples of that exist yet? Can you successfully tell ChatGPT to write about space exploration through augmented reality if no one has built it yet? Even if you asked, would you know whether the reply was correct?

Part of the issue comes from content consumption. We read things and assume they are correct. Words were written so they must have been researched and confirmed before being committed to the screen. Therefore we tend to read content in a passive form. We’re not reacting to what we’re seeing but instead internalizing it for future use. That’s fine if we’re reading for fun or not thinking critically about a subject. But for technical skills it is imperative that we’re constantly challenging what’s written to ensure that it’s accurate and useful.

If we only consumed content passively we’d never explore new ideas or create new ways to achieve outcomes. Likewise, if the only content we have is created by algorithm based on existing training and thought patterns we will never evolve past the point we are today. We can’t hope that a machine will have the insight to look beyond the limitations imposed upon it by the bounds of the program. I talked about this over six years ago where I said that machine learning would always give you great answers but true AI would be able to find them where they don’t exist.

That’s my real issue with ChatGPT. It’s great at producing content that is well within the standard deviation of what is expected. It can find answers. It can’t create them. If you ask it how to enter lunar orbit it can tell you. But if you ask it how to create a spacecraft to get to a moon in a different star system it’s going to be stumped. Because that hasn’t been created yet. It can only tell you what it’s seen. We won’t evolve as a species unless we remember that our machines are only as good as the programming we impart to them.


Tom’s Take

ChatGPT and programs like Stable Diffusion are fun. They show how far our technology has come. But they also illustrate the importance that we as creative beings can still have. Programs can only create within their bounds. Real intelligence can break out of the mold and go places that machines can’t dream of. We’ve spent billions of dollars and millions of hours trying to train software to think like a human and we’ve barely scratched the surface. What we need to realize is that while we can write software that can approximate how a human can think we can never replace the ability to create something from nothing.