Firmware SBOM Proposal

Richard Hughes (Red Hat) and Martin Fernandez (Eclypsium)
Wednesday, October 4, 2023

The UEFI SBoM Sub Team has been working on recommendations for what should be included in a firmware SBoM that conforms to the UEFI Standard, and how that data should be included in the firmware image. The purpose of this blog post is to present a possible solution, proposed by the authors, and to get feedback from a wider audience. Feedback may be addressed to the authors and/or the chairs of the sub team, Brian Mullen and Dick Wilkins.

Introduction

A Software Bill of Materials (SBoM) is a formal document that records which components are contained within a binary deliverable, and who is responsible for each part. Due to the increasing number of high-profile supply chain attacks, it has become more important to record information about critical software such as system and peripheral firmware. In the US, Executive Order 14028, “Improving the Nation's Cybersecurity”, now makes providing an SBoM a legal obligation for many companies.

For a long time, it has been difficult to build SBoM metadata for Tianocore/EDK2 because the source code is produced by one entity and then compiled, together with additional images, by an ODM on behalf of an OEM. Firmware projects such as Coreboot now include SBoM generation as part of the monolithic image, and firmware conforming to the UEFI standard also needs an SBoM that is both complete and accurate. However, as described in a recent whitepaper published by the UEFI Forum, a firmware image contains many sources and types of code and data, and obtaining proper SBoM metadata is difficult.

Although there is a multitude of formats for distributing the firmware SBoM to end-users, very few tools exist to actually embed SBoMs into binary blobs, or to extract components of an SBoM to build a composite firmware SBoM. The python-uswid project is one such tool, although other tools could be used instead. Conceptually, the value of the firmware SBoM lies in the quality of the information rather than the specific tools or formats used.

Firmware Components

There are, broadly speaking, four different kinds of components that are included in a firmware image, and each has a different recommendation on how to include the SBoM metadata.

Immutable Space-Critical Blobs

These are firmware blobs (not portable executable files) that are loaded onto a space-constrained sub-component. A good example here is Intel CPU Microcode that is loaded into the processor itself. The microcode is both signed and encrypted, and no additional supplemental data is allowed. The solution here is to provide “detached metadata” from the same source (or the same zip archive) that is used to distribute the immutable blob. For instance, the silicon vendor could produce a zip file deliverable with:


microcode.zip

  • Intel-ucode-06-03-02.bin
  • Intel-ucode-06-03-02.json
  • Intel-ucode-06-03-02.json.asc

...where the contents of the JSON file would be enough information to identify the binary component, and also specify the legal entity that is responsible for the component. The format is not prescribed, but goSWID would be a good example:

[
  {
    "lang": "en-US",
    "tag-id": "bcbd84ff-9898-4922-8ade-dd4bbe2e40ba",
    "software-name": "MCU 06-03-02",
    "software-version": "20230808",
    "version-scheme": "decimal",
    "software-meta": [
      {
        "product": "CPU"
      }
    ],
    "hash": [
      {
        "alg_id": "SHA256",
        "value": "067cb…"
      }
    ],
    "entity": [
      {
        "entity-name": "Intel Corporation",
        "reg-id": "com.intel",
        "role": [
          "tagCreator",
          "softwareCreator"
        ]
      }
    ]
  }
]

Note: As the SBoM metadata is detached, care must be taken to ensure that the files do not get “out of sync” and are updated at the same time in the firmware source tree. Detached metadata should always contain the hash value of the binary to allow validation and can be signed using a detached signature if the archive is not already signed. The public key should ideally be distributed on a keyserver or company website for verification. For instance, to create the detached signature:

gpg --armor --output Intel-ucode-06-03-02.json.asc --detach-sig Intel-ucode-06-03-02.json
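The hash validation described in the note above could be sketched in Python as follows. This is only an illustration: it assumes the detached JSON is a top-level list of tags using the "hash", "alg_id" and "value" keys shown in the example, and the function name detached_sbom_matches is hypothetical. It does not verify the detached signature, which would still need to be checked separately.

```python
import hashlib
import json
from pathlib import Path


def detached_sbom_matches(blob_path: str, sbom_path: str) -> bool:
    """Return True if a SHA256 entry in the detached SBoM matches the blob."""
    # Hash the binary exactly as it is shipped in the archive.
    digest = hashlib.sha256(Path(blob_path).read_bytes()).hexdigest()

    # Assumption: the JSON file is a list of tags, as in the example above.
    tags = json.loads(Path(sbom_path).read_text(encoding="utf-8"))
    for tag in tags:
        for entry in tag.get("hash", []):
            if entry.get("alg_id") != "SHA256":
                continue
            if entry.get("value", "").lower() == digest:
                return True
    return False
```

A build system could run such a check whenever the blob or its metadata changes, failing the build if the two have drifted out of sync.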

Immutable Large Blobs

These are format-agnostic firmware blobs (not portable executable files) that are loaded either from a disk or from NVRAM, and are not space critical. Examples here would be PCI OptionROM or AMD PSP, where adding less than 1kB of additional data would not be a problem. The solution here is to embed the SBoM metadata into the component itself rather than providing a detached SBoM. This ensures that the SBoM for the component can never get out of sync, and means the SBoM can be included automatically when the blob is being built, before the blob is signed. The authors propose to use a 16-byte “magic” value so that software can easily find the embedded SBoM without prescribing a specific offset into the image, which may be impossible due to specific format considerations. This is called the uSWID format: a 24-byte header, starting with the 16-byte magic value, followed by a compressed coSWID SBoM. uSWID was chosen due to the small compiled size of CBOR-encoded coSWID data compared to SPDX and SWID.

For instance, a firmware file of unspecified and unknowable format from a silicon vendor should have an embedded uSWID section we can locate and consume:

$ xxd ec-firmware.bin
00000000: dead beef dead beef dead beef dead beef  ................
00000010: dead beef dead beef 5342 4f4d d6ba 2eac  ........SBOM....
00000020: a3e6 7a52 aaee 3baf 0117 0098 0000 00a9  ..zR..;.........
00000030: 0f65 656e 2d55 5300 5021 242f f8e2 c658  .een-US.P!$/...X
00000040: 01a4 f380 7acc 08a2 d208 f501 6d4d 6f64  ....z.......mMod
00000050: 656d 4261 7365 6261 6e64 0d68 3131 2e32  emBaseband.h11.2
00000060: 322e 3333 0e01 05a2 1832 6575 5357 4944  2.33.....2euSWID
00000070: 182d 7828 6232 6564 3666 3165 6438 3538  .-x(b2ed6f1ed858
00000080: 3762 6630 3161 3239 3531 6437 3435 3132  7bf01a2951d74512
00000090: 6137 3066 3161 3531 3264 3338 0281 a318  a70f1a512d38....
000000a0: 1f6f 4875 6768 736b 6920 4c69 6d69 7465  .oHughski Limite
000000b0: 6418 206b 6875 6768 736b 692e 636f 6d18  d. khughski.com.
000000c0: 2183 0104 0204 80de adbe efde adbe efde  !...............
000000d0: adbe efde adbe efde adbe efde adbe efde  ................

Note: In real-world firmware this would be embedded as a zlib-compressed blob (e.g. using the --compress flag in the uswid command line tool), but here it is uncompressed to show the “magic” uSWID header and the plaintext SBoM data.
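As a rough illustration of how a tool might locate such a section, the Python sketch below scans a blob for the 16-byte magic value visible in the hexdump above. The field layout after the magic (a one-byte version, a 16-bit header size and a 32-bit payload size, all little-endian) is an assumption read off the example dump rather than a statement of the format definition, and find_uswid_payload is a hypothetical name. Compressed payloads are not handled here.

```python
import struct

# The 16 magic bytes as they appear in the hexdump above ("SBOM" + 12 bytes).
USWID_MAGIC = bytes.fromhex("53424f4dd6ba2eaca3e67a52aaee3baf")


def find_uswid_payload(blob: bytes):
    """Return (header_version, payload_bytes), or None if no magic is found."""
    offset = blob.find(USWID_MAGIC)
    if offset < 0:
        return None

    # Assumed layout after the magic: version (u8), header size (u16),
    # payload size (u32), little-endian with no padding.
    version, header_size, payload_size = struct.unpack_from(
        "<BHI", blob, offset + len(USWID_MAGIC)
    )

    # The payload starts header_size bytes after the start of the magic.
    start = offset + header_size
    payload = blob[start:start + payload_size]
    if len(payload) != payload_size:
        raise ValueError("uSWID payload is truncated")
    return version, payload
```

Reading the header size from the header itself, rather than hard-coding it, keeps the sketch tolerant of header revisions that append extra fields.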

Precompiled Portable Executable (PE) Binaries

Sometimes ODM or OEM vendors do not compile all the modules in the volume from source code, and instead get pre-compiled and pre-signed binaries from 3rd party suppliers. Whilst we could ask the supplier for a detached SBoM, as in the “Immutable Space-Critical Blob” case, there is a significant risk that the binary and SBoM metadata are not kept up to date. The binary could also include the uSWID “magic header” in some empty space in the PE header or in one of the data sections like .rdata, but that would require making compiler or source code changes to include the static data.

By including the coSWID data in a new .sbom COFF section, the need to scan for a magic header is avoided, and this data can be easily included much later, at link time. It is not necessary to use the magic header of uSWID as the PE header can be parsed for the correct offset of the section. For instance:

$ objdump -s -j .sbom fwupdx64.efi
fwupdx64.efi:     file format pei-x86-64
Contents of section .sbom:
 14000 bf0f6565 6e2d5553 0050b84e d8eda7b1  ..een-US.P.N....
 14010 502f83f6 90132e68 adef08f5 01686677  P/.....h.....hfw
 14020 75706478 36340d63 312e350e 19400005  updx64.c1.5..@..
 14030 bf183265 66777570 64183778 26454649  ..2efwupd.7x&EFI
 14040 2068656c 70657273 20746f20 696e7374   helpers to inst
 14050 616c6c20 73797374 656d2066 69726d77  all system firmw
 14060 61726518 2d6f312e 342d3139 2d673264  are.-o1.4-19-g2d
 14070 38636231 64ff029f bf181f6e 52696368  8cb1d......nRich
 14080 61726420 48756768 65731820 6b687567  ard Hughes. khug
 14090 68736965 2e636f6d 18219f06 01ffffff  hsie.com.!......
 140a0 049fbf18 26782768 74747073 3a2f2f73  ....&x'https://s
 140b0 7064782e 6f72672f 6c696365 6e736573  pdx.org/licenses
 140c0 2f4c4750 4c2d322e 302e6874 6d6c1828  /LGPL-2.0.html.(
 140d0 21ffffff                              !...

An additional benefit of including the SBoM in a COFF section is that it is verified by the existing Authenticode signature.
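To illustrate why no magic header is needed in this case, the following Python sketch walks the PE/COFF section table and returns the raw contents of a named section. It uses only the standard library, the function name read_pe_section is hypothetical, and it is a minimal sketch that performs little validation of the image.

```python
import struct


def read_pe_section(image: bytes, wanted: bytes = b".sbom"):
    """Return the raw bytes of the named PE section, or None if absent."""
    if image[:2] != b"MZ":
        raise ValueError("not a PE image: missing MZ signature")

    # e_lfanew at 0x3C holds the file offset of the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE image: missing PE signature")

    # COFF file header follows the signature (20 bytes).
    (num_sections,) = struct.unpack_from("<H", image, pe_offset + 6)
    (opt_header_size,) = struct.unpack_from("<H", image, pe_offset + 20)

    # The section table starts after the optional header; 40 bytes per entry.
    table = pe_offset + 24 + opt_header_size
    for index in range(num_sections):
        entry = table + index * 40
        name = image[entry:entry + 8].rstrip(b"\0")
        virtual_size, _virtual_addr, raw_size, raw_offset = struct.unpack_from(
            "<IIII", image, entry + 8
        )
        if name == wanted:
            # Raw data may be padded to file alignment; trim to virtual size.
            size = min(virtual_size, raw_size) if virtual_size else raw_size
            return image[raw_offset:raw_offset + size]
    return None
```

The returned bytes would then be handed to a coSWID (CBOR) decoder, which is outside the scope of this sketch.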

PE Binaries Built from Source

Most components in the image are compiled from source code and linked into PE binaries. Ideally the SBoM metadata would be automatically built and verified at compile time, and then added to one of:

  • the per-volume section (as a uSWID data section)
  • the PE binary, as coSWID in the .sbom COFF section
  • the per-firmware global SBoM collection

Specific to Tianocore/EDK2 firmware, there is an example source tree showing how we could supplement the information in the per-module .inf file with per-module and per-volume overrides. More specific recommendations on how to link the coSWID into the .sbom section, or how to collect it into a top-level SBoM, have not been made, as this will be heavily influenced by the existing proprietary build system and source code tools used to build the image.

For vendors using either EDK2 or a superset of EDK2, there are some incomplete example patches that generate per-module SBoM data. These may be more useful than the artificial source tree given above.

Final Comments

Whilst a firmware SBoM can be constructed by deconstructing the firmware volumes and looking for coSWID sections in PE files and uSWID magic headers in 3rd party blobs, some vendors may wish to construct a “defragmented” top-level firmware SBoM at build time. If this is done, it should be compressed to provide deduplication. Using uSWID for this would lead to an additional storage space requirement of ~70kB for 1000 typical components. In this case the PE binaries built from source do not need a .sbom section unless the components within the image are likely to be analyzed in isolation. This increase in size seems very feasible for Tianocore/EDK2 firmware, and in return provides compliance and auditing information inherent to the firmware itself.
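The deduplication effect of compressing a combined SBoM can be illustrated with a small Python sketch. Everything in it is synthetic: the component records, the vendor name and the helper names are hypothetical, and JSON stands in for the CBOR encoding purely because it is available in the standard library. The resulting sizes therefore say nothing about the ~70kB figure quoted above.

```python
import json
import zlib
from typing import Tuple


def make_component(index: int) -> dict:
    """Build one synthetic component record; only the name and ID vary."""
    return {
        "tag-id": f"component-{index:04d}",
        "software-name": f"Module{index:04d}",
        "software-version": "1.0",
        "entity": [
            {
                # The same (hypothetical) vendor repeats in every record.
                "entity-name": "Example Vendor",
                "reg-id": "vendor.example",
                "role": ["tagCreator", "softwareCreator"],
            }
        ],
    }


def collection_sizes(count: int = 1000) -> Tuple[int, int]:
    """Return (uncompressed_size, zlib_compressed_size) in bytes."""
    raw = json.dumps([make_component(i) for i in range(count)]).encode("utf-8")
    return len(raw), len(zlib.compress(raw, 9))


if __name__ == "__main__":
    raw_size, compressed_size = collection_sizes()
    print(f"uncompressed: {raw_size} bytes, compressed: {compressed_size} bytes")
```

Because most fields repeat across components from the same vendor, the compressor can encode the shared strings once and refer back to them, which is where the saving in a top-level collection comes from.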

To comply with Executive Order 14028, vendors should also publish the SWID XML as a download on the public device webpage. As a recommendation, the SHA256 checksum of the generated SWID XML file should be used as the unique collection ID for the composite SBoM. This would enable the SBoM to be found using a search engine even if the original OEM has been renamed or the device HTML URI has been modified.

It is expected that existing firmware analysis tools will read the SBoM metadata from firmware images and updates. When firmware is uploaded to the Linux Vendor Firmware Service it already extracts all available SBoM metadata and gives an immutable HTML page with SPID and CycloneDX download links that can be used for compliance purposes.

Whilst a complete and verified SBoM (including dependencies) of all components and subcomponents is the end goal, we believe that a best-effort SBoM is better than no SBoM, and some of the recommendations here should be expanded in scope and depth at a later time. This initiative will need significant buy-in from affected ISVs, IBVs, ODMs and OEMs, but with this set of recommendations we feel sure that the resulting firmware SBoM will be useful to security teams and consumers alike. This would greatly benefit the entire firmware ecosystem and make the global supply chain measurably safer.