Operator X: An Intern Experience
07/30/2024
BY Matthew Hambrecht and Sala McElroy
SealingTech’s new innovation, Operator X, is a chat interface built to assist cyber operators by bridging knowledge gaps with generative AI tools and techniques. It combines the existing knowledge base of a pre-trained large language model (LLM) with a retrieval-augmented generation (RAG) architecture, which lets operators expand the LLM’s knowledge without resource-intensive training. This means that operators can upload documents about their current task and converse with an LLM to quickly extract useful information. Operator X’s abilities are further expanded by modular agent deployments, which allow operators to interact with their environment through domain-specific LLM modules trained to use cybersecurity tools based on the operator’s requested action.
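To make the RAG idea concrete, here is a minimal sketch of the retrieval step: pick the uploaded text chunks most relevant to a question and place them in the prompt sent to the LLM. It is an illustration only, not Operator X’s implementation. It swaps in simple bag-of-words cosine similarity where a production system would more likely use embedding vectors, and the sample chunks, function names, and prompt wording are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True)
    return ranked[:k]


def build_prompt(query: str, chunks: list[str], k: int = 2) -> str:
    """Assemble a prompt that grounds the LLM in the retrieved chunks."""
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks, k))
    return (
        "Use only the context below to answer.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Hypothetical chunks standing in for an operator's uploaded documents.
DOCS = [
    "The mail server at 10.0.0.5 runs Postfix and listens on port 25.",
    "Badge access to the server room requires supervisor approval.",
    "The web server at 10.0.0.8 runs nginx and listens on port 443.",
]

if __name__ == "__main__":
    print(build_prompt("Which port does the mail server use?", DOCS))
```

Because the model only sees the retrieved text at question time, new documents can be added without retraining the model itself.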
As part of SealingTech’s CASTLE intern program this summer, Matt Hambrecht and Sala McElroy had the privilege of being a part of the team helping to develop Operator X. Matt is a rising senior at the University of Maryland, Baltimore County (UMBC) and is working toward his Bachelor of Science in Computer Science on the artificial intelligence (AI) and machine learning (ML) track. College graduate Sala McElroy holds a BS in Computer Science from Augusta University.
Researching and testing open-source LLMs
One of Sala’s goals on her team’s project was to create an LLM agent that helps operators use Nmap, a network scanning tool. The agent processes queries such as “What ports are open on this address?”, performs the appropriate Nmap scan, and returns the results to the operator. To achieve this, she focused on getting LLMs to understand and write code, ultimately using synthetic dataset generation and fine-tuning with low-rank adaptation (LoRA) to train a model specialized in executing Nmap commands in Bash. Additionally, Sala spent time researching ways to enhance the RAG component of the project, enabling Operator X to expand its knowledge base with external information.
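The query-to-scan flow described above can be sketched as follows. This is a hedged illustration, not the team’s agent: a few hand-written keyword rules stand in for the fine-tuned model, and the function names are hypothetical. The Nmap flags used (`--open`, `-sV`, `-sn`) are standard options. The target is validated before any command is built, and the command is passed as an argument list rather than a shell string.

```python
import ipaddress
import re
import subprocess


def extract_target(query: str) -> str:
    """Pull the first IPv4 address out of the query and validate it."""
    match = re.search(r"\b(?:\d{1,3}\.){3}\d{1,3}\b", query)
    if match is None:
        raise ValueError("no IPv4 address found in query")
    # ip_address() raises ValueError for out-of-range octets such as 999.1.1.1.
    return str(ipaddress.ip_address(match.group()))


def plan_scan(query: str) -> list[str]:
    """Map an operator query to an Nmap argument list.

    Keyword rules are a stand-in (assumption) for the fine-tuned LLM.
    """
    target = extract_target(query)
    text = query.lower()
    if "service" in text or "version" in text:
        flags = ["-sV"]     # probe open ports for service/version info
    elif re.search(r"\b(up|alive|online)\b", text):
        flags = ["-sn"]     # host discovery only, no port scan
    else:
        flags = ["--open"]  # default scan, showing only open ports
    return ["nmap", *flags, target]


def run_scan(argv: list[str], timeout: int = 120) -> str:
    """Run the planned scan and return Nmap's text output.

    Requires Nmap to be installed; only scan hosts you are authorized to scan.
    """
    result = subprocess.run(
        argv, capture_output=True, text=True, check=True, timeout=timeout
    )
    return result.stdout


if __name__ == "__main__":
    print(plan_scan("What ports are open on 192.168.1.10?"))
```

Separating planning from execution makes it possible to show the operator the exact command for review before anything touches the network.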
Improving model accuracy
Matt spent his time researching dataset expansion to improve model knowledge and robustness for the agent modules. He helped build tools that combine data augmentation and rephrasing to expand datasets with new queries and solutions. He examined how rewording queries and adding noise may improve a model’s ability to understand a request and act accordingly, regardless of how an operator phrases it.
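As a rough illustration of this kind of dataset expansion, the sketch below generates reworded and lightly noised variants of one query while keeping its solution unchanged. The templates, the typo strategy, and the function names are assumptions made for the example and are not the team’s tooling, which may well use an LLM for rephrasing.

```python
import random

# Hypothetical rephrasing templates; {q} is replaced by the original query.
TEMPLATES = [
    "{q}",
    "Can you tell me: {q}",
    "Please help. {q}",
    "hey, {q}",
]


def add_typo(text: str, rng: random.Random) -> str:
    """Swap two adjacent letters in one purely alphabetic word.

    Words containing digits or punctuation (for example IP addresses) are
    left alone so the noise cannot change what the query refers to.
    """
    words = text.split(" ")
    candidates = [i for i, w in enumerate(words) if w.isalpha() and len(w) >= 4]
    if not candidates:
        return text
    i = rng.choice(candidates)
    w = words[i]
    j = rng.randrange(len(w) - 1)
    words[i] = w[:j] + w[j + 1] + w[j] + w[j + 2:]
    return " ".join(words)


def augment(query: str, solution: str, n: int, seed: int = 0) -> list[tuple[str, str]]:
    """Return n (query variant, solution) pairs for one original example."""
    rng = random.Random(seed)  # seeded so the expansion is reproducible
    pairs = []
    for _ in range(n):
        variant = rng.choice(TEMPLATES).format(q=query)
        if rng.random() < 0.5:  # add noise to roughly half of the variants
            variant = add_typo(variant, rng)
        pairs.append((variant, solution))
    return pairs


if __name__ == "__main__":
    for q, s in augment("What ports are open on 192.168.1.10?",
                        "nmap --open 192.168.1.10", n=5):
        print(f"{q!r} -> {s!r}")
```

Every variant maps to the same solution, so a model trained on the expanded set sees many phrasings of a request paired with one correct action.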
Matt is currently building a classification model fine-tuned to interpret an operator’s intent from a query and select the corresponding agent to complete the task. This will give operators a more seamless, natural chat experience because they will no longer need to choose a tool directly.
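The routing idea can be sketched in a few lines. Here a simple keyword-overlap score stands in for the fine-tuned classifier, and the agent names, keyword sets, and fallback are hypothetical choices made for the example.

```python
import re

# Hypothetical agents and the vocabulary that hints at each one.
AGENT_KEYWORDS = {
    "nmap_agent": {"scan", "port", "ports", "host", "hosts", "open", "network"},
    "document_agent": {"document", "documents", "report", "manual",
                       "summarize", "uploaded"},
}
FALLBACK_AGENT = "general_chat"


def classify_intent(query: str) -> str:
    """Pick the agent whose keywords overlap most with the query.

    A fine-tuned classification model would replace this scoring step;
    the surrounding routing logic would stay the same.
    """
    words = set(re.findall(r"[a-z]+", query.lower()))
    scores = {agent: len(words & keywords)
              for agent, keywords in AGENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    # With no matching keywords, fall back to plain chat rather than guess.
    return best if scores[best] > 0 else FALLBACK_AGENT


if __name__ == "__main__":
    for q in ("What ports are open on this address?",
              "Summarize the uploaded manual",
              "Hello there"):
        print(f"{q!r} -> {classify_intent(q)}")
```

A fallback matters in practice: sending an unrecognized request to general chat is safer than launching a tool the operator did not ask for.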
Like Sala, Matt has put time into researching improvements to the RAG component of Operator X by testing newer tools and techniques. Beyond the generative AI side of things, he also worked with his team to design and build the demo for TechNet Cyber Baltimore this past June, and he contributed security analysis, bug hunting, and improvements to the API and interface.
For their next steps on the project, Matt and Sala plan to continue improving the LLM features of the application and create more specialized agents for Operator X. They look forward to showing an enhanced version at the 2024 TechNet Augusta Conference and Expo next month!
Are you or someone you know looking to gain practical, real-world experience in the exciting and ever-evolving landscape of cybersecurity and engineering? Contact us about our paid internship program at info@sealingtech.com.