Job System and Burst — Parallelism and SIMD
IJob, IJobParallelFor, the Burst compiler, NativeArray — a working path to high performance.
Unity runs your MonoBehaviour code on a single main thread. If Update() computes paths for a thousand
NPCs, the frame rate drops. Job System + Burst is the official way to spread computation across all CPU
cores and get SIMD optimization through a dedicated compiler (JIT in the Editor, AOT in player builds).
What it is
- Job System: a package (com.unity.jobs; in Unity 6 it’s core). An API for describing parallel tasks through IJob/IJobParallelFor structs.
- Burst: a package (com.unity.burst). A high-performance compiler: it takes your code and turns it into SIMD-optimized native code (via LLVM). Speedups of ×5–×100 are common, depending on the workload.
- Unity.Collections: NativeArray<T>, NativeList<T>, etc. GC-free structures that are passed between a job and the main thread.
- Unity.Mathematics: float3, quaternion, math.*. Types on which Burst works optimally (instead of UnityEngine.Vector3/Quaternion).
An analogy from the web: Web Workers + SharedArrayBuffer + WASM SIMD. The same three elements: parallelism, cross-thread data, and low-level optimization.
Job System is threads. Burst is compilation to SIMD. NativeArray is shared memory, with race conditions caught by the safety system.
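To make the Unity.Mathematics point concrete, here is a minimal sketch of the kind of pure function Burst handles well. The class name MathSketch and the Steer helper are hypothetical and exist only for illustration; float3 and the math.* functions are the real Unity.Mathematics types.

```csharp
using Unity.Mathematics;

// Hypothetical helper class, for illustration only.
public static class MathSketch
{
    // Returns a velocity pointing from position toward target,
    // limited to maxSpeed units per second.
    public static float3 Steer(float3 position, float3 target, float maxSpeed)
    {
        float3 toTarget = target - position;
        float distance = math.length(toTarget);

        // Avoid dividing by zero when we are already at the target.
        if (distance < 1e-5f)
        {
            return float3.zero;
        }

        float3 direction = toTarget / distance;
        return direction * math.min(distance, maxSpeed);
    }
}
```

Because the function touches only value types and math.*, it can be called from inside a Burst-compiled job as well as from ordinary MonoBehaviour code.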
A basic IJob
The task is to compute result = sum(a × b) for two large arrays:
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;

[BurstCompile]
public struct DotProductJob : IJob
{
    [ReadOnly] public NativeArray<float> A;
    [ReadOnly] public NativeArray<float> B;
    public NativeArray<float> Result; // element [0] holds the result

    public void Execute() {
        float sum = 0f;
        for (int i = 0; i < A.Length; i++) {
            sum += A[i] * B[i];
        }
        Result[0] = sum;
    }
}

public class JobUser : MonoBehaviour
{
    private void Start() {
        var a = new NativeArray<float>(10_000_000, Allocator.TempJob);
        var b = new NativeArray<float>(10_000_000, Allocator.TempJob);
        var result = new NativeArray<float>(1, Allocator.TempJob);
        // ... fill a, b with values ...

        var job = new DotProductJob { A = a, B = b, Result = result };
        JobHandle handle = job.Schedule();
        handle.Complete(); // block the main thread until the job finishes
        Debug.Log($"Dot product: {result[0]}");

        a.Dispose();
        b.Dispose();
        result.Dispose();
    }
}
What Burst does:
- The [BurstCompile] attribute tells the compiler to “take this struct and generate optimized native code”.
- The for loop can be unrolled and vectorized into SIMD (4 or 8 floats per instruction, depending on the target CPU).
- Without Burst, the job runs as ordinary managed C# (IL), often about 10× slower.
Allocator.TempJob: for data that lives up to 4 frames (a job usually finishes earlier). You still
have to call .Dispose() yourself.
Allocator.Persistent: if the NativeArray outlives many jobs (you free it manually via .Dispose()).
Allocator.Temp: for very short-lived allocations (1 frame). A Temp array allocated on the main
thread cannot be passed to a scheduled job.
Safety: Unity complains about a leak if you forget .Dispose().
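The three allocator lifetimes can be sketched side by side. This is an illustrative sketch, not a recommended pattern for every project: FillJob and AllocatorSketch are hypothetical names, and the array sizes are arbitrary.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;

// Hypothetical job, used only to give the TempJob buffer a consumer.
[BurstCompile]
public struct FillJob : IJob
{
    public NativeArray<float> Values;
    public float Value;

    public void Execute()
    {
        for (int i = 0; i < Values.Length; i++)
        {
            Values[i] = Value;
        }
    }
}

public class AllocatorSketch : MonoBehaviour
{
    // Persistent: lives as long as this component and is freed in OnDestroy.
    private NativeArray<float> _longLived;

    private void Awake()
    {
        _longLived = new NativeArray<float>(1024, Allocator.Persistent);
    }

    private void Update()
    {
        // Temp: scratch space used and released on the main thread within one frame.
        var scratch = new NativeArray<int>(16, Allocator.Temp);
        scratch[0] = Time.frameCount;
        scratch.Dispose();

        // TempJob: handed to a job; the using block disposes it once the job is done.
        using (var perJob = new NativeArray<float>(256, Allocator.TempJob))
        {
            new FillJob { Values = perJob, Value = 1f }.Schedule().Complete();
            _longLived[0] = perJob[0];
        }
    }

    private void OnDestroy()
    {
        // IsCreated guards against disposing an array that was never allocated.
        if (_longLived.IsCreated)
        {
            _longLived.Dispose();
        }
    }
}
```

NativeArray implements IDisposable, so a using block is one way to make sure a short-lived buffer is released even if the code between allocation and disposal throws.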
IJobParallelFor — splitting work across cores
If the task is “compute something for each element independently”, use IJobParallelFor:
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;
using Unity.Mathematics; // float3
using UnityEngine;

[BurstCompile]
public struct MoveBoidsJob : IJobParallelFor
{
    public NativeArray<float3> Positions;
    [ReadOnly] public NativeArray<float3> Velocities;
    public float DeltaTime;

    // Called once per index; each call touches only its own element.
    public void Execute(int i) {
        Positions[i] += Velocities[i] * DeltaTime;
    }
}

public class BoidsManager : MonoBehaviour
{
    private NativeArray<float3> _positions;
    private NativeArray<float3> _velocities;
    private const int Count = 100_000;

    private void Start() {
        _positions = new NativeArray<float3>(Count, Allocator.Persistent);
        _velocities = new NativeArray<float3>(Count, Allocator.Persistent);
        // ... initialization ...
    }

    private void Update() {
        var job = new MoveBoidsJob {
            Positions = _positions,
            Velocities = _velocities,
            DeltaTime = Time.deltaTime,
        };
        // innerloopBatchCount = how many iterations to give one worker thread at a time
        JobHandle handle = job.Schedule(Count, 1024);
        handle.Complete();
    }

    private void OnDestroy() {
        // Guard with IsCreated in case Start() never ran.
        if (_positions.IsCreated) _positions.Dispose();
        if (_velocities.IsCreated) _velocities.Dispose();
    }
}
100 thousand boids are updated in roughly 0.5 ms on an 8-core CPU with Burst, versus roughly 50 ms for naive C# (exact figures depend on hardware and data layout). This is no longer “optimization”; it’s a different class of performance.
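Calling Complete() right after Schedule() keeps the main thread idle while the workers run. One common alternative, sketched below under the assumption that nothing reads the positions between Update and LateUpdate, is to schedule in Update and complete in LateUpdate. MoveJob and DeferredCompleteSketch are hypothetical names chosen so the block compiles on its own.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;
using Unity.Mathematics;
using UnityEngine;

[BurstCompile]
public struct MoveJob : IJobParallelFor
{
    public NativeArray<float3> Positions;
    [ReadOnly] public NativeArray<float3> Velocities;
    public float DeltaTime;

    public void Execute(int i)
    {
        Positions[i] += Velocities[i] * DeltaTime;
    }
}

public class DeferredCompleteSketch : MonoBehaviour
{
    private const int Count = 100_000;

    private NativeArray<float3> _positions;
    private NativeArray<float3> _velocities;
    private JobHandle _handle;

    private void Start()
    {
        _positions = new NativeArray<float3>(Count, Allocator.Persistent);
        _velocities = new NativeArray<float3>(Count, Allocator.Persistent);
    }

    private void Update()
    {
        var job = new MoveJob
        {
            Positions = _positions,
            Velocities = _velocities,
            DeltaTime = Time.deltaTime,
        };

        // Schedule only; the main thread continues with other Update work.
        _handle = job.Schedule(Count, 1024);
    }

    private void LateUpdate()
    {
        // Wait here, as late as possible, before anything reads _positions.
        _handle.Complete();
    }

    private void OnDestroy()
    {
        // A job may still be running; finish it before freeing its memory.
        _handle.Complete();
        if (_positions.IsCreated) _positions.Dispose();
        if (_velocities.IsCreated) _velocities.Dispose();
    }
}
```

The trade-off is that the arrays must not be touched on the main thread between Schedule and Complete; in the Editor the safety system raises an error if they are.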
Chains of jobs
Jobs can be linked through JobHandle:
var firstJob = new ComputeForcesJob { /* ... */ };
JobHandle firstHandle = firstJob.Schedule(count, 64);
var secondJob = new IntegrateVelocityJob { /* ... */ };
JobHandle secondHandle = secondJob.Schedule(count, 64, firstHandle);
// secondJob will wait for firstJob before starting
secondHandle.Complete(); // block the main thread at the end
This is exactly the foundation of the DOTS paradigm: many small jobs with dependencies, and the scheduler parallelizes them across a pool of worker threads by itself.
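A fuller version of such a chain might look like the sketch below. The snippet above leaves the two job bodies unspecified, so the fields and formulas here (a spring force toward the origin, a Stiffness parameter) are assumptions made purely for illustration.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;
using Unity.Mathematics;

// Hypothetical body: a spring force pulling each point toward the origin.
[BurstCompile]
public struct ComputeForcesJob : IJobParallelFor
{
    [ReadOnly] public NativeArray<float3> Positions;
    [WriteOnly] public NativeArray<float3> Forces;
    public float Stiffness;

    public void Execute(int i)
    {
        Forces[i] = -Stiffness * Positions[i];
    }
}

// Hypothetical body: integrates the force into the velocity (unit mass assumed).
[BurstCompile]
public struct IntegrateVelocityJob : IJobParallelFor
{
    [ReadOnly] public NativeArray<float3> Forces;
    public NativeArray<float3> Velocities;
    public float DeltaTime;

    public void Execute(int i)
    {
        Velocities[i] += Forces[i] * DeltaTime;
    }
}

public static class JobChainSketch
{
    // Schedules both jobs and returns the handle of the last one.
    public static JobHandle ScheduleStep(
        NativeArray<float3> positions,
        NativeArray<float3> forces,
        NativeArray<float3> velocities,
        float deltaTime)
    {
        int count = positions.Length;

        var firstJob = new ComputeForcesJob
        {
            Positions = positions,
            Forces = forces,
            Stiffness = 2f, // arbitrary illustrative value
        };
        JobHandle firstHandle = firstJob.Schedule(count, 64);

        var secondJob = new IntegrateVelocityJob
        {
            Forces = forces,
            Velocities = velocities,
            DeltaTime = deltaTime,
        };

        // Passing firstHandle makes the second job wait for the first,
        // which is what allows it to read Forces safely.
        return secondJob.Schedule(count, 64, firstHandle);
    }
}
```

The caller decides when to block: call Complete() on the returned handle before reading velocities on the main thread, or pass the handle on as a dependency of a further job.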
Safety system and common mistakes
The Unity Job System has a built-in safety system (in the Editor, not in release). It catches:
- Race condition: two jobs write to the same NativeArray at the same time.
- Read-after-write without a dependency: a job reads an array that another job writes to in parallel.
- Disposed array: an attempt to use a NativeArray after .Dispose().
Mark fields the job only reads as [ReadOnly]: several jobs may then read the same container at the
same time, which gives the scheduler more freedom for parallelism.
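As a sketch of what [ReadOnly] buys you, the two jobs below read the same input array with no dependency between them. SumJob, MaxJob, and ReadOnlySketch are hypothetical names; JobHandle.CombineDependencies is the standard way to wait on both at once.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;
using Unity.Mathematics;

[BurstCompile]
public struct SumJob : IJob
{
    [ReadOnly] public NativeArray<float> Input;
    public NativeArray<float> Result; // element [0] receives the sum

    public void Execute()
    {
        float sum = 0f;
        for (int i = 0; i < Input.Length; i++)
        {
            sum += Input[i];
        }
        Result[0] = sum;
    }
}

[BurstCompile]
public struct MaxJob : IJob
{
    [ReadOnly] public NativeArray<float> Input;
    public NativeArray<float> Result; // element [0] receives the maximum

    public void Execute()
    {
        float max = float.MinValue;
        for (int i = 0; i < Input.Length; i++)
        {
            max = math.max(max, Input[i]);
        }
        Result[0] = max;
    }
}

public static class ReadOnlySketch
{
    public static void SumAndMax(NativeArray<float> input, out float sum, out float max)
    {
        var sumResult = new NativeArray<float>(1, Allocator.TempJob);
        var maxResult = new NativeArray<float>(1, Allocator.TempJob);

        // Both jobs only read 'input', so they may run at the same time
        // without a dependency between them.
        JobHandle sumHandle = new SumJob { Input = input, Result = sumResult }.Schedule();
        JobHandle maxHandle = new MaxJob { Input = input, Result = maxResult }.Schedule();

        // Wait for both jobs with a single combined handle.
        JobHandle.CombineDependencies(sumHandle, maxHandle).Complete();

        sum = sumResult[0];
        max = maxResult[0];

        sumResult.Dispose();
        maxResult.Dispose();
    }
}
```

If the Input fields were not marked [ReadOnly], the safety system would be expected to reject the second Schedule() call in the Editor, because it would have to assume that both jobs might write to the same array.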
Burst cannot work with managed objects or call most of the UnityEngine API, for example
Transform.position or GameObject.Find (Debug.Log is a partial exception: Burst supports it with
limited string formatting). What remains is pure functions from Unity.Mathematics, Unity.Collections,
and plain C# on value types. This is the price of optimization: you move the logic into a “pure”
computational world, then feed the results back into Transforms on the main thread.
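The "compute in a job, apply on the main thread" split can be sketched as follows. BobJob, ApplyToTransformsSketch, and the targets field are hypothetical; the sketch assumes the Transform array is assigned in the Inspector and does not change size at runtime.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;
using Unity.Mathematics;
using UnityEngine;

// Pure computation: no Transform access inside the job.
[BurstCompile]
public struct BobJob : IJobParallelFor
{
    public NativeArray<float3> Positions;
    public float Seconds;

    public void Execute(int i)
    {
        float3 p = Positions[i];
        p.y = math.sin(Seconds + i); // each object bobs with its own phase
        Positions[i] = p;
    }
}

public class ApplyToTransformsSketch : MonoBehaviour
{
    // Hypothetical field: objects to move, assigned in the Inspector.
    [SerializeField] private Transform[] targets;

    private NativeArray<float3> _positions;

    private void Start()
    {
        _positions = new NativeArray<float3>(targets.Length, Allocator.Persistent);
        for (int i = 0; i < targets.Length; i++)
        {
            _positions[i] = targets[i].position; // implicit Vector3 -> float3
        }
    }

    private void Update()
    {
        var job = new BobJob { Positions = _positions, Seconds = Time.time };
        job.Schedule(_positions.Length, 64).Complete();

        // Back on the main thread: copy the results into the Transforms.
        for (int i = 0; i < targets.Length; i++)
        {
            targets[i].position = _positions[i]; // implicit float3 -> Vector3
        }
    }

    private void OnDestroy()
    {
        if (_positions.IsCreated) _positions.Dispose();
    }
}
```

The copy loop itself runs on the main thread, so for very large object counts it can become the new bottleneck; the job only removes the cost of the computation, not of touching the Transforms.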
When it is worth using
- Heavy computation: pathfinding for 100+ agents, flock simulation, LOD calculation, batch processing of assets.
- Custom mesh generation: procedural landscapes, marching cubes, voxel terrain.
- AI batch decision-making: ECS-style, where you have 1000 NPCs and each one’s “what to do” is computed separately.
When it is NOT needed
- Little computation (10–100 iterations) — the Schedule overhead eats the gain.
- Logic tightly coupled to the UnityEngine API — rewriting everything onto NativeArray is expensive.
- A simple prototype — Burst isn’t needed as long as the FPS is fine.
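One way to handle the small-workload case without maintaining two code paths is to keep the job but execute it inline when the input is small. The sketch below is an assumption-laden illustration: ScaleJob and SmallWorkloadSketch are hypothetical, and the threshold value is a placeholder that should be replaced with a number measured in the Profiler.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;

[BurstCompile]
public struct ScaleJob : IJobParallelFor
{
    public NativeArray<float> Values;
    public float Factor;

    public void Execute(int i)
    {
        Values[i] *= Factor;
    }
}

public static class SmallWorkloadSketch
{
    // Placeholder threshold, not a measured value.
    private const int ParallelThreshold = 10_000;

    public static void Scale(NativeArray<float> values, float factor)
    {
        var job = new ScaleJob { Values = values, Factor = factor };

        if (values.Length < ParallelThreshold)
        {
            // Run() executes the job immediately on the calling thread,
            // skipping the worker-thread scheduling overhead.
            job.Run(values.Length);
        }
        else
        {
            // Large input: split across worker threads in batches of 256.
            job.Schedule(values.Length, 256).Complete();
        }
    }
}
```

Run() still goes through the Burst-compiled code, so small inputs keep the compiler's optimizations while avoiding the cost of waking worker threads.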
Comparison with ECS / Entities
ECS (the Entities package) is the next step. There, data is stored in Native arrays (Chunks) from the start, and systems automatically work through Job + Burst. If you are writing a simulation-heavy project (RTS, sandbox, MMO), DOTS gives a structural advantage over the MonoBehaviour-Job combination. But the entry barrier is higher.