Editor’s note: This article is the seventh in a series, “Full-Spectrum: Capabilities and Authorities in Cyber and the Information Environment.” The series endeavors to present expert commentary on diverse issues surrounding US competition with peer and near-peer competitors in the cyber and information spaces. Read all articles in the series here.
Special thanks to series editors Capt. Maggie Smith, PhD of the Army Cyber Institute and MWI fellow Dr. Barnett S. Koven.
Every day, military aircraft move people and parts around the globe. As part of United States Transportation Command (USTRANSCOM), planes transport resupply to the European theater, insert special operations forces into Iraq and Syria, and—if necessary—carry the nation’s nuclear weapons in its hour of need. In an average week, USTRANSCOM completes more than 1,900 air missions, supporting everything from humanitarian relief operations to personal property shipments. But when it comes to a shipment of spare parts—or a routine trip to a conference—using military aircraft is both impractical and expensive compared to commercial air solutions. For transport, then, the Department of Defense relies on a reasonable mix of government and commercial solutions. Similarly, when it comes to cloud computing, a reasonable mix is in order: certain use cases are better suited to commercial clouds, like Amazon Web Services (AWS); however, government-owned clouds can often get the job done inexpensively and without cost overruns.
To win in the information environment, the Department of Defense must be able to access and analyze data and transform disparate pieces of data into intelligence. DoD needs to glean insights from data on demand, at the same or lower cost than at present. Cloud services promise to deliver on all demands, especially cost—a 2019 GAO report, for example, estimates that cloud computing has saved government agencies $291 million. But not all clouds have the same capabilities. The 2010 federal Cloud First mandate to migrate to the cloud has pushed the military and intelligence agencies to consider commercial clouds almost exclusively. Although the policy was updated to “Cloud Smart” in 2018, many leaders still view commercial clouds as the primary solution for their data storage and computing needs.
In this new era of great power competition, artificial intelligence will play a far greater role in conflict than conventional weaponry. If America is to succeed, it must have better access to data, better information sharing, and the ability to eliminate redundant tasks—all of which depend on cloud technologies. However, solutions need not, and in many cases should not, rely solely on commercial cloud services. Indeed, DoD should avoid overdependence on commercial cloud providers for at least three reasons: access to data, the cost of high-powered computing, and the cost of massive data storage.
Access to Data
Consider access to data first. The commercial cloud provider AWS went through an extensive vetting process before it earned the Pentagon’s trust and became certified to store secret and top secret information. However, despite that vetting, it is not certain that AWS’s interests—as a private company—will always align with those of the US government. For example, what if a corporation or its employees disagree with a US government action or policy? Of course, contracts between private and public entities provide legal protections to prevent adverse actions from being taken in the heat of conflict, but contractual obligations can be broken and even a temporary interruption can have profound consequences. How does DoD conduct military operations if powerful, private entities are able to stop data flow? Ultimately, who prevails when military imperatives conflict with corporate and private sector values? Given the recent case of corporate outrage and position-taking over Georgia’s voting law, conflict between public and private entities is an important consideration when opting for commercial cloud services.
The High Cost of Cloud Computing
Commercial clouds have numerous advantages: easy-to-navigate dashboards, clear optics into data usage, and a common set of tools within a company’s cloud ecosystem that exceed the capabilities previously available through the Defense Information Systems Agency (DISA). The intelligence community (IC) understood DISA’s limitations early on, and its contract with Amazon’s Commercial Cloud Services (C2S) was a forerunner in government adoption of the cloud. In fact, the IC worked quickly to decommission its own, in-house data centers, causing a forced migration into the commercial cloud.
As the IC’s data analysis processes evolved, user numbers surged, powerful algorithms were applied, and C2S expenses increased. Despite attempts to rein in the costs of commercial clouds, the IC soon realized (just as many in industry have) that performing all data storage and computing in commercial clouds is unwise and unsustainable.
The problem is structural: data ingestion and computing needs grow exponentially, while cost-saving measures increase only linearly, recouping fractions of a penny per gigabyte at a time. A recent study by Synergy Research Group shows that “enterprise spending on cloud infrastructure services continued to ramp up aggressively in 2020, growing by 35 percent to reach almost $130 billion. Meanwhile enterprise spending on data center hardware and software dropped by 6 percent to under $90 billion.” Essentially, despite the talk about migrating to the cloud to save money, overall information technology expenditures increased from roughly $180 billion in 2019 to $210 billion in 2020.
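The mismatch between exponential data growth and linear cost savings can be made concrete with a toy projection. Every number below (starting volume, growth rate, unit price, discount) is an illustrative assumption, not a figure from this article or real cloud pricing:

```python
# Toy projection: exponential data growth vs. linear per-unit price cuts.
# All numbers are illustrative assumptions, not real cloud pricing.

def projected_annual_cost(years, start_tb=1000, growth_rate=0.40,
                          start_price_per_tb=250.0, yearly_discount=10.0):
    """Return a list of yearly storage bills.

    Data volume compounds at `growth_rate` per year (exponential),
    while the price per TB falls by a flat `yearly_discount` (linear).
    """
    costs = []
    tb = start_tb
    price = start_price_per_tb
    for _ in range(years):
        costs.append(tb * price)
        tb *= 1 + growth_rate                      # data grows exponentially
        price = max(price - yearly_discount, 0.0)  # savings accrue linearly
    return costs

bills = projected_annual_cost(5)
# The bill rises every year even though the unit price keeps dropping.
print([round(b) for b in bills])
```

Under these assumed rates, the annual bill grows every single year despite steadily falling unit prices, which is the dynamic the Synergy figures above reflect at industry scale.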
Further, some cost-reduction measures do not fully meet the needs of the IC or DoD. For example, keeping all data available for immediate processing and analysis requires what is called “hot storage”—always ready and accessible on demand. Conversely, data in “cold storage” must be retrieved and prepared before use. As data ages and the immediacy of its demand declines, it is slowly transferred through various stages of warm to cold storage. The coldest level of storage at AWS, known as Glacier Deep Archive, often stores data on magnetic tape, and it can take up to forty-eight hours for AWS to “thaw” Glacier Deep–stored data. For archival information, like compliance records, this thawing process is inconsequential. Indeed, Glacier storage may be safer and more cost-effective than storing archived records and information on local servers. Additionally, when access to archived records is required, as in an audit, a company typically has ample time to thaw the Glacier Deep–stored records before the information is needed for inspection.
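Tiering of this kind is typically expressed as a storage lifecycle rule. The sketch below builds an S3-style lifecycle configuration that moves objects into progressively colder classes as they age; the rule ID, prefix, and day thresholds are illustrative assumptions, not a recommended policy:

```python
# Sketch of an S3-style lifecycle rule: objects migrate to colder
# (cheaper, slower-to-retrieve) storage classes as they age.
# Rule ID, prefix, and thresholds are illustrative assumptions.

lifecycle_configuration = {
    "Rules": [
        {
            "ID": "tier-aging-archive-data",
            "Status": "Enabled",
            "Filter": {"Prefix": "archive/"},
            "Transitions": [
                # Warm: infrequent-access tier after 30 days
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                # Cold: Glacier after 90 days
                {"Days": 90, "StorageClass": "GLACIER"},
                # Coldest: Glacier Deep Archive after a year; bulk
                # retrieval from this tier can take up to 48 hours
                {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
            ],
        }
    ]
}

# With boto3, a configuration like this would be applied roughly as:
#   s3.put_bucket_lifecycle_configuration(
#       Bucket="example-bucket",
#       LifecycleConfiguration=lifecycle_configuration)
```

The trade-off the article describes lives in those `Days` thresholds: set them too aggressively and mission data ends up behind a forty-eight-hour thaw; too conservatively and the advertised savings never materialize.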
However, the IC and DoD are more likely to need quick and consistent access to data. For example, both intelligence agencies and military units are often presented with targets of opportunity. Sometimes, a six-month target history is required on demand, making quick and detailed analysis necessary to provide intelligence products before informed, strategic decisions can be made. Often, this process must be completed within hours or the target may slip away. Rapid analysis is not possible when data is in cold storage and needs to be thawed before it can be ingested and processed into an intelligence product. The result has been that more and more agencies are delaying the transfer of data to lower-cost storage for fear of not having immediate access to their data. The time lag for data recovery is another way that the advertised cost savings of moving to a commercial cloud fail to meet mission needs.
The Cost of Massive Data Storage
So how did we get to this point? Historically, all US government data storage and processing was done in countless local, regional, and national IT centers controlled by the government. A drawback to local storage and processing was that IT was often treated as an up-front cost rather than a recurring investment, leading IT managers to overbuy equipment for fear that they would not be able to obtain upgrades for several years. As a result, managers purchased excess capacity in then-current technologies, further inflating IT costs with years of extra power consumption and the staffing needed to maintain the equipment. Often, by the time an organization needed the extra processing and storage capacity, the pre-purchased technology was obsolete.
This approach also created thousands of small data stovepipes. Since data was managed locally and was not tied into a central repository, the data was left isolated, unsearchable, nonsharable, and not formatted for rapid ingestion by analysis platforms. The data and computing nodes consisted of everything from warehouse-sized data centers filled with rows and rows of humming servers to local IT offices operating a half-rack of servers in a hall closet to underpin the local area network of a single building or operating location. Maintaining the localized systems typically required exceptionally skilled IT professionals to be available around the clock. Staffing requirements added steep costs in exchange for what was often a meager amount of storage and computing power, especially when compared to the scale of storage and processing available through commercial cloud services. To capitalize on the gains the commercial world realized by moving to the cloud, the government adopted policies to shift away from its large, decentralized IT centers to large commercial clouds, with their centralized data control, shared resources, and scalability.
One regularly cited reason for moving DoD’s data and processing to a single, large cloud provider was the idea that housing all data in one location would facilitate data sharing. The effort culminated in the ill-fated Joint Enterprise Defense Infrastructure, or JEDI, contract, which would have authorized up to $10 billion over ten years for commercial cloud services. However, co-locating all data and processing in one cloud does not automatically solve DoD’s data-sharing woes. Even once data is liberated from local stovepipes, it still requires testing and integration to make it usable. Therefore, to solve DoD’s data problem, policies and practices are also required to make the data knowable, shareable, and usable across systems.
In an effort to solve some of the data-sharing and transferability problems, the Joint Artificial Intelligence Center, or JAIC, released a solicitation for data readiness for artificial intelligence development services that aims “to help the DoD and Government users prepare data for use in AI applications by providing an easily accessible path to access the cutting-edge commercial services needed to meet the complex technical challenges involved in preparing data for AI.” Once awarded, the contract should do far more to make DoD data AI-ready than any commercial cloud provider alone could.
Additionally, certain DoD and government projects need to be kept in government clouds for security purposes. For example, there are highly secure, private government clouds where new software can be tested and used to conduct limited operations. Secure government clouds have unique authorities unavailable in a commercial cloud. As such, they fill a critical capability gap by allowing developmental software, including products from Small Business Innovation Research (SBIR) grant winners, to quickly prove its mettle. Often, SBIR awardees have twelve months to demonstrate their value, and if six months or more is spent getting the software approved to be loaded, the vendor is left scrambling to succeed in an even shorter period. With specialized government clouds, the capabilities of SBIR products can be rapidly evaluated in a controlled environment, and applications can be processed and approved for general use more rapidly, clearing the current backlog of potential contenders. One standout project supporting SBIR work is TITAN Hybrid IT.
TITAN (short for Technology for Innovation and Testing on Accredited Networks) is a collaboration between the Air Force intelligence directorate, the National Reconnaissance Office (NRO), and the National Geospatial-Intelligence Agency (NGA). TITAN Hybrid IT provides IT-as-a-service, or ITaaS, to dozens of customers. Notably, some of the most well-known DoD AI programs rely on TITAN’s unique placement, access, and infrastructure. Indeed, programs from the under secretary of defense for intelligence and security, the under secretary of defense for research and engineering, sponsors from the Air Force, NRO, and NGA, and numerous other entities rely on TITAN services. Yet, despite being a nimble and cost-effective service, TITAN’s future remains uncertain, given both the widespread perception that cloud migration necessitates commercial cloud providers and an arcane budgetary process.
The Way Ahead
In the eight years since the initial Amazon C2S contract was awarded, the IC has slowly realized that commercial clouds are not the solution for all data storage and computing needs. The IC has begun reinvigorating its own data centers, in part to recoup savings on its extensive hot storage needs. Additionally, many of the processes that enable the algorithmic development crucial to expanding AI and machine learning capabilities are processing-power intensive, causing commercial cloud service costs to escalate. Much like the hot storage debacle, high-performance computing is costly. A more effective solution is to house government-owned, contractor-operated storage and computing in government-owned facilities for the most sensitive data, the most resource-intensive workloads, and the baseline algorithmic churn. The end product of this evolution is a hybrid, multi-cloud ecosystem, combining the best attributes of government-owned, contractor-operated clouds with those of commercial clouds.
Additionally, baseline DoD cloud services should be owned and controlled by DoD. For moments of surge, when indigenous systems do not provide enough storage and processing power, augmenting the system with commercial clouds is both good tradecraft and a fiscally responsible solution. A hybrid ecosystem also allows DoD to benefit from state-of-the-art commercial tools when the mission dictates, while simultaneously enabling the department to rely on its own capabilities for more standard tasks. Using commercial clouds for all DoD storage and computing needs is akin to using a rental car for all of one’s transportation needs. Any baseline service should have an easy, readily available surge capability that can scale into commercial clouds as needed, and be able to do so seamlessly. The private sector uses a hybrid approach to data storage and computing, and hybrid clouds are the model being implemented by the IC following its multiyear campaign to migrate everything to commercial clouds. DoD should follow suit and establish a balance between localized, government-owned clouds and commercial cloud offerings to take advantage of best-in-breed capabilities, regardless of cloud ownership. This is what the services’ cloud strategies dictate, but not what is often espoused or executed at the operational level.
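The surge pattern described above is commonly called “cloud bursting.” A minimal scheduling sketch of the idea follows, assuming a fixed government-cloud capacity and a commercial overflow pool; the job names and capacity units are hypothetical:

```python
# Minimal "cloud bursting" sketch: jobs run on the government cloud
# until its capacity is exhausted, then overflow to a commercial cloud.
# Job names, sizes, and the capacity figure are hypothetical.

def place_jobs(jobs, gov_capacity):
    """Assign each (name, size) job to 'gov' while capacity remains,
    otherwise to 'commercial'. Returns a dict of name -> placement."""
    placements = {}
    used = 0
    for name, size in jobs:
        if used + size <= gov_capacity:
            placements[name] = "gov"
            used += size
        else:
            placements[name] = "commercial"  # surge into commercial cloud
    return placements

jobs = [("baseline-etl", 40), ("model-training", 35), ("surge-analytics", 50)]
print(place_jobs(jobs, gov_capacity=100))
# first two jobs fit in the 100-unit government cloud; the third bursts out
```

Real implementations add data-sensitivity rules, egress-cost awareness, and accreditation checks before any job leaves the government boundary, but the baseline-plus-surge logic is the same.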
In conclusion, just as we do not divest our fleet of mobility aircraft because there are commercial carriers for people and packages, we should not completely divest our government-owned clouds just because there are now commercial providers. Each has an important role in ensuring our national security.
Daniel Jasper is the senior vice president of research and innovation at Cyber Safari, LLC. He recently retired from the US Air Force after twenty-four years as an intelligence officer and avionics technician.
The views expressed are those of the author and do not reflect the official position of the United States Military Academy, Department of the Army, Department of Defense, US government, or that of any organization with which the author is affiliated, including Cyber Safari, LLC.
Image credit: Senior Airman Franklin R. Ramos, US Air Force