Blob

Blobs are binary large objects that can be used to store any type of unstructured/binary data and are designed for large content. Blobs support streaming and offer better performance for content larger than about 20KB. Blobs are built on the native JavaScript Blob type, which Harper extends for integrated storage with the database. To use blobs, you would generally declare a field as a Blob type in your schema:

type MyTable {
	id: Any! @primaryKey
	data: Blob
}

You can then create a blob, which writes the binary data to disk and can be included (as a reference) in a record. For example, you can create a record with a blob like this:

let blob = await createBlob(largeBuffer);
await MyTable.put({ id: 'my-record', data: blob });

The data attribute in this example is a blob reference; it can be used like any other attribute in the record, but it is stored separately and its data must be accessed asynchronously. You can retrieve the blob data with the standard Blob methods:

let buffer = await blob.bytes();
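
Since Harper blobs extend the native JavaScript Blob type, the other standard Blob accessors should be available as well. A minimal sketch:

let record = await MyTable.get('my-record');
let blob = record.data;
let text = await blob.text(); // decode the contents as a UTF-8 string
let arrayBuffer = await blob.arrayBuffer(); // the raw bytes as an ArrayBuffer
let stream = blob.stream(); // a ReadableStream for incremental processing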

If you are creating a resource method, you can return a Response object with a blob as the body:

export class MyEndpoint extends MyTable {
	async get() {
		return {
			status: 200,
			headers: {},
			body: this.data, // this.data is a blob
		};
	}
}
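
A client can then consume the blob like any other HTTP response body. For example (the URL, port, and path here are illustrative, not prescribed by Harper):

// hypothetical client-side usage; adjust the URL to match your endpoint
let response = await fetch('http://localhost:9926/MyEndpoint/my-record');
let data = await response.arrayBuffer(); // or consume response.body as a stream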

One of the important characteristics of blobs is that they natively support asynchronous streaming of data, which matters for both the creation and retrieval of large data. When you create a blob with createBlob, the returned blob creates the storage entry immediately while the data is streamed to storage. This means you can create a blob from a buffer or from a stream, and you can even create a record that references a blob before the blob is fully written to storage. Note that this also means blobs are not atomic or ACID compliant; streaming is the opposite of ACID/atomic writes, which would prevent access to data while it is being written. For example, you can create a blob from a stream:

let blob = await createBlob(stream);
// at this point the blob exists, but the data is still being written to storage
await MyTable.put({ id: 'my-record', data: blob });
// we now have written a record that references the blob
let record = await MyTable.get('my-record');
// we now have a record that gives us access to the blob; we can asynchronously access
// the blob's data or stream it, and it will become available as the data is written to storage
let stream = record.data.stream();

This can be powerful functionality for large media content, where content can be streamed into storage while it is simultaneously streamed out in real time to users as it is received. Alternatively, you can wait for the blob to be fully written to storage before creating a record that references it:

let blob = await createBlob(stream);
// at this point the blob exists, but the data may not be fully written to storage yet
await blob.save(MyTable);
// we now know the blob is fully written to storage
await MyTable.put({ id: 'my-record', data: blob });
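
For instance, here is a minimal sketch of streaming a file from disk into blob storage, assuming createBlob accepts a Node.js readable stream as its "stream" input (the file path and record id are illustrative):

import { createReadStream } from 'node:fs';

// stream a (hypothetical) large file from disk into blob storage
let fileStream = createReadStream('/path/to/large-video.mp4');
let blob = await createBlob(fileStream);
// wait until the blob is fully persisted before referencing it
await blob.save(MyTable);
await MyTable.put({ id: 'large-video', data: blob });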

Error Handling

Because blobs can be streamed and referenced before they are complete, an error or interruption could occur while data is still streaming to the blob (after the record has been committed). You can register an error handler on the blob to handle an interrupted write:

export class MyEndpoint extends MyTable {
	async get() {
		let blob = this.data;
		blob.on('error', () => {
			// if this was a caching table, we may want to invalidate or delete this record:
			this.invalidate();
		});
		return {
			status: 200,
			headers: {},
			body: blob,
		};
	}
}

Blob size

Blobs that are created from streams may not have the standard size property available, because the size may not be known while the data is still being streamed. Consequently, the size property may be undefined until the size is determined. You can listen for the size event to be notified once the size is available:

let record = await MyTable.get('my-record');
let blob = record.data;
blob.size; // will be defined if the blob was saved with a known size
let stream = blob.stream(); // start streaming the data
if (blob.size === undefined) {
	blob.on('size', (size) => {
		// called once the size is available
	});
}
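
If it is convenient, the property check and the size event can be wrapped in a small promise-based helper. This is a hypothetical sketch, not part of the Harper API:

// hypothetical helper: resolves with the blob's size whether or not it is already known
function blobSize(blob) {
	if (blob.size !== undefined) return Promise.resolve(blob.size);
	return new Promise((resolve) => blob.on('size', resolve));
}
let size = await blobSize(record.data);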


See the configuration documentation for more information on configuring where blobs are stored.
