PlumbrC Documentation

Everything you need to detect and redact secrets from logs, streams, and text data.

CLI: pipe or file mode, 5M lines/sec
Python: pip install plumbrc, 2M lines/sec (bulk)
REST API: POST /api/redact, ~0.1ms server-side

Quick Start

Option 1: Native C Binary (fastest)

bash
# Install dependencies
sudo apt install build-essential libpcre2-dev

# Clone and build
git clone https://github.com/AmritRai1234/plumbrC.git
cd plumbrC
make -j$(nproc)

# Run
echo "api_key=AKIAIOSFODNN7EXAMPLE" | ./build/bin/plumbr
# Output: api_key=[REDACTED:aws_key]

Option 2: Python Package

bash
pip install plumbrc

python
from plumbrc import Plumbr

p = Plumbr()
result = p.redact("api_key=AKIAIOSFODNN7EXAMPLE")
print(result)  # api_key=[REDACTED:aws_key]

Option 3: REST API (for testing)

bash
curl -X POST https://plumbr.ca/api/redact \
  -H "Content-Type: application/json" \
  -d '{"text": "api_key=AKIAIOSFODNN7EXAMPLE"}'

Note: The REST API is for testing and prototyping. Network latency (100–800ms round trip) is the bottleneck; the server itself processes each request in ~0.1ms. For production throughput, use the native C binary or Python package.
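
To see why the network dominates, the figures in the note can be turned into a rough throughput bound. This sketch assumes strictly sequential requests, each waiting for the previous response; that is an assumption for illustration, not a statement about how the hosted service must be used. The function name is hypothetical.

```python
# Rough upper bound on sequential REST throughput, using the figures quoted above.
SERVER_MS = 0.1            # server-side processing time per request (from the note)
RTT_MS = (100.0, 800.0)    # network round-trip range in milliseconds (from the note)


def max_requests_per_sec(rtt_ms: float, server_ms: float = SERVER_MS) -> float:
    """Requests per second when each request waits for the previous one."""
    return 1000.0 / (rtt_ms + server_ms)


best = max_requests_per_sec(RTT_MS[0])
worst = max_requests_per_sec(RTT_MS[1])
print(f"best case:  {best:.2f} requests/sec")
print(f"worst case: {worst:.2f} requests/sec")
```

Under these assumptions the bound is about 10 requests per second at best and about 1.25 at worst, which is why batching or running locally matters for volume.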

CLI Usage

The plumbr binary reads from stdin and writes clean output to stdout. Perfect for Unix pipelines.

Basic usage

bash
# Pipe mode — redact in real-time
tail -f /var/log/app.log | plumbr | tee safe.log

# File mode — redact an entire file
plumbr < app.log > redacted.log

# Inline test
echo "password=s3cret123 token=ghp_abc123xyz" | plumbr

Options

| Flag | Description | Default |
| --- | --- | --- |
| -p, --patterns FILE | Load patterns from FILE | built-in defaults |
| -d, --defaults | Use built-in default patterns (14) | on |
| -D, --no-defaults | Disable built-in defaults | |
| -j, --threads N | Worker threads (0 = auto-detect) | 0 (auto) |
| -q, --quiet | Suppress statistics output | off |
| -s, --stats | Print statistics to stderr | on |
| -H, --hwinfo | Show hardware detection info (SIMD, cores) | |
| -v, --version | Show version | |
| -h, --help | Show help | |

Examples

bash
# Use custom patterns only
plumbr -D -p my_patterns.txt < app.log > safe.log

# Use defaults + custom patterns
plumbr -p extra_patterns.txt < app.log > safe.log

# Load all 702 patterns
plumbr -p patterns/all.txt < app.log > safe.log

# Multi-threaded (8 workers)
plumbr -j 8 < huge.log > safe.log

# CI/CD pipeline: redact before shipping logs
# (log-shipper is a placeholder for your own shipping command)
docker logs my-app 2>&1 | plumbr | log-shipper

# Hardware info
plumbr -H
# AVX2: yes, SSE 4.2: yes, Cores: 12, Threads: 24
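
The same stdin-to-stdout contract can be driven from a script. The following is a minimal sketch, assuming `plumbr` is on your PATH; `redact_via_cli` is a hypothetical helper name, and `-q` is the quiet flag from the options table.

```python
import subprocess
from typing import Sequence


def redact_via_cli(text: str, cmd: Sequence[str] = ("plumbr", "-q")) -> str:
    """Send text to a filter command's stdin and return what it writes to stdout."""
    proc = subprocess.run(
        list(cmd),
        input=text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return proc.stdout
```

Calling `redact_via_cli("password=s3cret123\n")` returns whatever the binary prints for that input. Passing a different `cmd` makes the helper usable with any stdin/stdout filter, which is convenient for testing without the binary installed.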

C Library API

Embed PlumbrC directly into your C/C++ application via libplumbr. Available as both static (.a) and shared (.so) libraries.

Build the library

bash
# Static library (libplumbr.a)
make lib

# Shared library (libplumbr.so)
make shared

# Files created:
# build/lib/libplumbr.a    — static
# build/lib/libplumbr.so   — shared

Basic example

c
#include <libplumbr.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    // Create instance with default patterns
    libplumbr_t *p = libplumbr_new(NULL);
    if (!p) return 1;

    printf("Loaded %zu patterns\n", libplumbr_pattern_count(p));

    // Redact a single line
    const char *input = "AWS key: AKIAIOSFODNN7EXAMPLE";
    size_t out_len;
    char *safe = libplumbr_redact(p, input, strlen(input), &out_len);

    if (safe) {
        printf("Output: %s\n", safe);
        // Output: AWS key: [REDACTED:aws_key]
        free(safe);  // Caller owns the result
    }

    // Get statistics
    libplumbr_stats_t stats = libplumbr_get_stats(p);
    printf("Lines: %zu, Modified: %zu\n",
           stats.lines_processed, stats.lines_modified);

    libplumbr_free(p);
    return 0;
}

Compile & link

bash
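# With static library (name the archive explicitly; -lplumbr would pick
# libplumbr.so when both libraries are present in build/lib)
gcc -Iinclude my_app.c build/lib/libplumbr.a -lpcre2-8 -lpthread -o my_app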

# With shared library
gcc -Iinclude my_app.c -Lbuild/lib -lplumbr -lpcre2-8 -lpthread \
    -Wl,-rpath,./build/lib -o my_app

Configuration

c
libplumbr_config_t config = {
    .pattern_file = "/etc/plumbr/patterns/all.txt",
    .pattern_dir  = NULL,        // Or load all files from a directory
    .num_threads  = 0,           // 0 = auto-detect
    .quiet        = 1,           // Suppress stats output
};

libplumbr_t *p = libplumbr_new(&config);

In-place redaction (zero-copy)

c
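// Redact inside the caller's buffer, without allocating a new string
char buffer[1024];
strcpy(buffer, "token=ghp_xxxxx");
size_t len = strlen(buffer);

int new_len = libplumbr_redact_inplace(p, buffer, len, sizeof(buffer));
// Check the upper bound too, so the terminator always fits in the buffer
if (new_len >= 0 && (size_t)new_len < sizeof(buffer)) {
    buffer[new_len] = '\0';
    printf("%s\n", buffer);
}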

Batch processing

c
const char *inputs[] = {
    "key=AKIAIOSFODNN7EXAMPLE",
    "Normal log line",
    "email: user@example.com",
};
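// Compute lengths at run time; the first string is 24 bytes, not 26
size_t input_lens[3];
for (int i = 0; i < 3; i++) {
    input_lens[i] = strlen(inputs[i]);
}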
char *outputs[3];
size_t output_lens[3];

int count = libplumbr_redact_batch(p, inputs, input_lens,
                                   outputs, output_lens, 3);

for (int i = 0; i < count; i++) {
    printf("%s\n", outputs[i]);
    free(outputs[i]);
}

Full API reference

| Function | Description |
| --- | --- |
| libplumbr_new(config) | Create instance (NULL for defaults) |
| libplumbr_redact(p, input, len, &out_len) | Redact a line → new string (caller frees) |
| libplumbr_redact_inplace(p, buf, len, cap) | In-place redaction (zero-copy) |
| libplumbr_redact_batch(p, ins, lens, outs, olens, n) | Batch process multiple lines |
| libplumbr_get_stats(p) | Get processing statistics |
| libplumbr_reset_stats(p) | Reset counters to zero |
| libplumbr_pattern_count(p) | Number of loaded patterns |
| libplumbr_version() | Version string |
| libplumbr_is_threadsafe() | Returns 1 if thread-safe |
| libplumbr_free(p) | Free all resources |

Python Package

The plumbrc package wraps the C engine via ctypes: the same engine with no network overhead, at roughly 2M lines/sec through the bulk API.

Install

bash
pip install plumbrc

Basic usage

python
from plumbrc import Plumbr

# Create a redactor (loads 14 default patterns)
p = Plumbr()
print(f"Loaded {p.pattern_count} patterns")

# Redact a single line
result = p.redact("AWS key: AKIAIOSFODNN7EXAMPLE")
print(result)  # AWS key: [REDACTED:aws_key]

# Redact multiple lines
lines = [
    "User login with api_key=sk-proj-abc123",
    "Normal log line with no secrets",
    "Email sent to user@example.com",
]
safe_lines = p.redact_lines(lines)
for line in safe_lines:
    print(line)

Context manager

python
from plumbrc import Plumbr

# Automatically cleans up C resources when done
with Plumbr() as p:
    result = p.redact("password=s3cret123")
    print(result)  # password=[REDACTED:password]

Custom patterns

python
from plumbrc import Plumbr

# Load patterns from a custom file
p = Plumbr(pattern_file="/path/to/patterns.txt")

# Or load from a directory of pattern files
p = Plumbr(pattern_dir="/etc/plumbr/patterns/")

Statistics

python
from plumbrc import Plumbr

p = Plumbr()

# Process some text
for i in range(1000):
    p.redact(f"api_key=AKIAIOSFODNN7EXAMPLE line {i}")

# Check stats
stats = p.stats
print(f"Lines processed: {stats['lines_processed']}")
print(f"Lines modified:  {stats['lines_modified']}")
print(f"Patterns matched: {stats['patterns_matched']}")

# Get version
print(f"PlumbrC version: {Plumbr.version()}")

Process log files

python
from plumbrc import Plumbr

p = Plumbr()

# Redact an entire log file
with open("app.log") as f_in, open("safe.log", "w") as f_out:
    for line in f_in:
        f_out.write(p.redact(line))

print(f"Done. Stats: {p.stats}")

Error handling

python
import sys

from plumbrc import Plumbr, LibraryNotFoundError, RedactionError

try:
    p = Plumbr()
except LibraryNotFoundError:
    # Without the library there is nothing to call, so stop here
    sys.exit("libplumbr.so not found: install PlumbrC first")

some_text = "password=s3cret123"  # placeholder input

try:
    result = p.redact(some_text)
except RedactionError as e:
    print(f"Redaction failed: {e}")

REST API

Base URL: https://plumbr.ca/api

Honest note: The C engine processes each request in ~0.1ms, but the network round trip adds 100–800ms depending on your location. That makes the REST API best suited to testing and prototyping. For production throughput, run the native C binary or the Python package locally.

POST /api/redact

Redact secrets from a text string. Handles multi-line input.

bash
curl -X POST https://plumbr.ca/api/redact \
  -H "Content-Type: application/json" \
  -d '{"text": "AWS key: AKIAIOSFODNN7EXAMPLE\nemail: user@example.com"}'

json
{
  "redacted": "AWS key: [REDACTED:aws_key]\nemail: [REDACTED:email]",
  "stats": {
    "lines_processed": 2,
    "lines_modified": 2,
    "patterns_matched": 2,
    "processing_time_ms": 0.066
  }
}
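
Client code usually needs only the `redacted` field and a couple of counters. The sketch below parses the sample response above with the standard library; the field names are taken from that sample, and `summarize` is a hypothetical helper name.

```python
import json
from typing import List, Tuple

# Sample body copied from the response shown above.
SAMPLE = """
{
  "redacted": "AWS key: [REDACTED:aws_key]\\nemail: [REDACTED:email]",
  "stats": {
    "lines_processed": 2,
    "lines_modified": 2,
    "patterns_matched": 2,
    "processing_time_ms": 0.066
  }
}
"""


def summarize(body: str) -> Tuple[List[str], float]:
    """Return the redacted lines and the fraction of lines that were modified."""
    payload = json.loads(body)
    stats = payload["stats"]
    processed = stats["lines_processed"]
    ratio = stats["lines_modified"] / processed if processed else 0.0
    return payload["redacted"].split("\n"), ratio


lines, ratio = summarize(SAMPLE)
for line in lines:
    print(line)
print(f"{ratio:.0%} of lines modified")
```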

POST /api/redact/batch

Process multiple texts in a single request. More efficient for bulk operations.

bash
curl -X POST https://plumbr.ca/api/redact/batch \
  -H "Content-Type: application/json" \
  -d '{
    "texts": [
      "key=AKIAIOSFODNN7EXAMPLE",
      "Normal log line",
      "token=ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    ]
  }'
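
The same request can be assembled from Python with only the standard library. This is a sketch: the URL and the `texts` field come from the curl example above, while `build_batch_request` is a hypothetical helper name. The response shape for this endpoint is not documented here, so the sending step is left commented out and would only print the raw body.

```python
import json
import urllib.request
from typing import List

BATCH_URL = "https://plumbr.ca/api/redact/batch"  # endpoint from the curl example


def build_batch_request(texts: List[str], url: str = BATCH_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST carrying several texts."""
    body = json.dumps({"texts": texts}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_batch_request(["key=AKIAIOSFODNN7EXAMPLE", "Normal log line"])
print(req.get_method(), req.full_url)

# To send it (requires network access):
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.read().decode("utf-8"))
```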

GET /health

Health check and server statistics.

bash
curl https://plumbr.ca/health

json
{
  "status": "healthy",
  "version": "1.0.2",
  "server_version": "1.0.0",
  "uptime_seconds": 3600.0,
  "patterns_loaded": 14,
  "stats": {
    "requests_total": 1523,
    "requests_ok": 1520,
    "requests_error": 3,
    "bytes_processed": 524288,
    "avg_rps": 0.4
  }
}
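
The counters in this response can be turned into simple monitoring signals. The sketch below works on a trimmed copy of the sample above. It assumes `avg_rps` is `requests_total` divided by `uptime_seconds`, which matches the sample (1523 / 3600 ≈ 0.4) but is not stated by the API.

```python
import json

# Trimmed copy of the /health sample shown above.
HEALTH = json.loads("""
{
  "status": "healthy",
  "uptime_seconds": 3600.0,
  "stats": {"requests_total": 1523, "requests_ok": 1520, "requests_error": 3}
}
""")

stats = HEALTH["stats"]
error_rate = stats["requests_error"] / stats["requests_total"]
avg_rps = stats["requests_total"] / HEALTH["uptime_seconds"]  # assumed definition

print(f"error rate: {error_rate:.2%}")
print(f"average requests/sec: {avg_rps:.1f}")
```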

Self-host the API server

bash
# Build the server
make server

# Run on custom port
./build/bin/plumbr-server --port 9090 --threads 4

# With all patterns
./build/bin/plumbr-server --pattern-file patterns/all.txt

# Server options:
#   --port PORT         Listen port (default: 8080)
#   --host ADDR         Bind address (default: 0.0.0.0)
#   --threads N         Worker threads (0=auto)
#   --pattern-dir DIR   Load patterns from directory
#   --pattern-file FILE Load patterns from file

Custom Patterns

PlumbrC ships with 702 patterns in 12 categories. You can also define your own.

Pattern file format

One pattern per line. Four pipe-delimited fields:

text
name|literal|regex|replacement

# Fields:
#   name        — Human-readable label (used in [REDACTED:name])
#   literal     — Literal prefix for Aho-Corasick (fast pre-filter)
#   regex       — PCRE2 regex for exact matching
#   replacement — Replacement text (usually [REDACTED:name])
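
For tooling around pattern files (linting, merging, counting), the four-field format can be read with a few lines of Python. This is a sketch, not PlumbrC's own parser. It assumes `#` lines are comments, as in the examples in this section, and that only the regex field may contain extra `|` characters; how the real loader treats pipes inside a regex is not specified here. The `demo` pattern is made up for illustration.

```python
from typing import Iterable, List, NamedTuple


class Pattern(NamedTuple):
    name: str
    literal: str
    regex: str
    replacement: str


def parse_patterns(lines: Iterable[str]) -> List[Pattern]:
    """Parse name|literal|regex|replacement lines, skipping blanks and # comments."""
    patterns = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Assumption: the first two fields and the last field never contain '|',
        # so any extra pipes belong to the regex (for example, alternation).
        head = line.split("|", 2)
        if len(head) < 3 or "|" not in head[2]:
            raise ValueError(f"expected 4 pipe-delimited fields: {line!r}")
        regex, replacement = head[2].rsplit("|", 1)
        patterns.append(Pattern(head[0], head[1], regex, replacement))
    return patterns


sample = [
    "# AWS Access Key",
    "aws_key|AKIA|AKIA[0-9A-Z]{16}|[REDACTED:aws_key]",
    "",
    "demo|tok_|tok_(a|b)+|[REDACTED:demo]",  # hypothetical pattern with alternation
]
for p in parse_patterns(sample):
    print(p)
```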

Example patterns

text
# AWS Access Key
aws_key|AKIA|AKIA[0-9A-Z]{16}|[REDACTED:aws_key]

# GitHub Personal Access Token
github_pat|ghp_|ghp_[A-Za-z0-9]{36}|[REDACTED:github_pat]

# Email address
email|@|[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}|[REDACTED:email]

# Custom API key for your service
my_api|MYKEY_|MYKEY_[a-f0-9]{32}|[REDACTED:my_api_key]
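
To show how the literal and regex fields work together, here is a small stand-in written with Python's `re` module. It is an illustration only: a substring test plays the role of the Aho-Corasick pre-filter and `re` plays the role of PCRE2, so behavior on edge cases may differ from the real engine. The function name is hypothetical.

```python
import re

# (name, literal, regex, replacement) copied from the examples above.
PATTERNS = [
    ("aws_key", "AKIA", r"AKIA[0-9A-Z]{16}", "[REDACTED:aws_key]"),
    ("github_pat", "ghp_", r"ghp_[A-Za-z0-9]{36}", "[REDACTED:github_pat]"),
    ("email", "@", r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}", "[REDACTED:email]"),
]
COMPILED = [(lit, re.compile(rx), rep) for _, lit, rx, rep in PATTERNS]


def redact_line(line: str) -> str:
    """Run each regex only when its literal occurs somewhere in the line."""
    for literal, regex, replacement in COMPILED:
        if literal in line:  # cheap pre-filter: skip the regex entirely on a miss
            # A callable replacement inserts the text as-is (no escape processing).
            line = regex.sub(lambda _m, rep=replacement: rep, line)
    return line


print(redact_line("key=AKIAIOSFODNN7EXAMPLE sent to user@example.com"))
print(redact_line("Normal log line"))
```

A line without any literal never reaches a regex, which is the point of the pre-filter field. A literal hit does not guarantee a redaction either: `ghp_short` contains `ghp_` but fails the 36-character regex and is left unchanged.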

Built-in categories

cloud/: AWS, GCP, Azure, DigitalOcean
auth/: OAuth, JWT, session tokens
vcs/: GitHub, GitLab, Bitbucket
payment/: Stripe, PayPal, Square
database/: PostgreSQL, MySQL, MongoDB
crypto/: Private keys, mnemonics
pii/: Email, phone, SSN, IP
secrets/: Generic API keys, passwords
infra/: Docker, K8s, Terraform
communication/: Slack, Twilio, SendGrid
analytics/: Mixpanel, Segment, Amplitude
social/: Facebook, Twitter tokens

Loading patterns

bash
# Default 14 patterns (built-in, no file needed)
plumbr < app.log > safe.log

# Load all 702 patterns
plumbr -p patterns/all.txt < app.log > safe.log

# Specific category
plumbr -p patterns/cloud/aws.txt < app.log > safe.log

# Multiple: defaults + custom
plumbr -p extra.txt < app.log > safe.log

# Custom only (disable defaults)
plumbr -D -p my_patterns.txt < app.log > safe.log

Integrations

Ready-to-use configs for Docker, Kubernetes, and popular log pipelines.

Docker

yaml
# docker-compose.yml
version: '3.8'
services:
  app:
    image: your-app:latest
    volumes:
      - app-logs:/var/log/app

  plumbr:
    build:
      context: .
      dockerfile: Dockerfile
    entrypoint: >
      sh -c "tail -F /var/log/app/*.log | plumbr > /var/log/safe/app.log"
    volumes:
      - app-logs:/var/log/app:ro
      - safe-logs:/var/log/safe
    restart: unless-stopped

volumes:
  app-logs:
  safe-logs:

Kubernetes sidecar

yaml
# Your app + PlumbrC sidecar in the same pod
containers:
  - name: app
    image: your-app:latest
    volumeMounts:
      - name: logs
        mountPath: /var/log/app

  - name: plumbr
    image: plumbrc:latest
    command: ["sh", "-c", "tail -F /var/log/app/*.log | plumbr > /var/log/safe/app.log"]
    volumeMounts:
      - name: logs
        mountPath: /var/log/app
        readOnly: true
      - name: safe-logs
        mountPath: /var/log/safe
    resources:
      requests: { cpu: 50m, memory: 32Mi }
      limits: { cpu: 200m, memory: 64Mi }

systemd service

ini
[Unit]
Description=PlumbrC Log Redaction Server
After=network.target

[Service]
ExecStart=/usr/local/bin/plumbr-server --port 8080
Restart=always
User=plumbr

[Install]
WantedBy=multi-user.target

Other pipelines

Fluentd: Use the exec filter to pipe logs through plumbr. Config in integrations/fluentd/

Logstash: Use the pipe filter. Config in integrations/logstash/

Vector: Use the exec transform. Config in integrations/vector/

Building from Source

PlumbrC is pure C11 with minimal dependencies.

Prerequisites

bash
# Ubuntu/Debian
sudo apt install build-essential libpcre2-dev

# Fedora/RHEL
sudo dnf install gcc make pcre2-devel

# macOS (x86_64 only — SIMD requires x86)
brew install pcre2

Build targets

| Command | Output | Description |
| --- | --- | --- |
| make | build/bin/plumbr | Optimized release binary (-O3, LTO, -march=native) |
| make debug | build/bin/plumbr | Debug build (-g, -O0) |
| make server | build/bin/plumbr-server | HTTP API server |
| make lib | build/lib/libplumbr.a | Static library |
| make shared | build/lib/libplumbr.so | Shared library (for Python) |
| make sanitize | build/bin/plumbr | ASan + UBSan build |
| make install | | Install to /usr/local/bin |
| make clean | | Remove all build artifacts |

Quick install script

bash
# One-liner install
curl -sSL https://raw.githubusercontent.com/AmritRai1234/plumbrC/main/install.sh | bash

# Or manually:
git clone https://github.com/AmritRai1234/plumbrC.git
cd plumbrC
make -j$(nproc)
sudo make install
plumbr --version

Performance

Real benchmark numbers on AMD Ryzen 5 5600X. Optimized for AMD with AVX2 SIMD.

Native C binary

| Config | Single-threaded | Multi-threaded (8T) |
| --- | --- | --- |
| Default (14 patterns) | 3.7M lines/sec | 4.9M lines/sec |
| Full (702 patterns) | 2.1M lines/sec | 3.3M lines/sec |
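
As a reading aid, the table can be reduced to speedup and parallel-efficiency figures. This sketch uses only the four numbers above; efficiency is defined here as speedup divided by the thread count, a common convention rather than something the benchmark reports.

```python
# Throughput in millions of lines/sec, copied from the table above.
RESULTS = {
    "Default (14 patterns)": (3.7, 4.9),
    "Full (702 patterns)": (2.1, 3.3),
}
THREADS = 8

speedups = {}
for config, (single, multi) in RESULTS.items():
    speedups[config] = multi / single
    efficiency = speedups[config] / THREADS
    print(f"{config}: {speedups[config]:.2f}x speedup, {efficiency:.0%} parallel efficiency")
```

The speedups come out at roughly 1.3x and 1.6x, so extra threads help more with the full pattern set, where each line costs more work.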

Comparison

| Method | Throughput | Bottleneck |
| --- | --- | --- |
| Native C binary | 5M lines/sec | CPU (AVX2 + SSE 4.2) |
| Python package (ctypes → C) | 2M lines/sec (bulk) | FFI overhead (~0.5μs/line) |
| REST API (over internet) | ~0.1ms server-side | Network latency (100–800ms RTT) |
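
The throughput figures are easier to compare as a time budget per line. A short calculation, using only the numbers in the table:

```python
# Throughput figures from the comparison table above (lines per second).
NATIVE_LPS = 5_000_000
PYTHON_LPS = 2_000_000

native_ns = 1e9 / NATIVE_LPS   # average time per line, native binary
python_ns = 1e9 / PYTHON_LPS   # average time per line, Python bulk API

print(f"native: {native_ns:.0f} ns/line")
print(f"python: {python_ns:.0f} ns/line")
print(f"extra per line in Python: {python_ns - native_ns:.0f} ns")
```

That works out to 200 ns per line natively and 500 ns per line through Python, so the difference of 300 ns is of the same order as the FFI overhead quoted in the table.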

Why it's fast

SSE 4.2 hardware pre-filter: scans 16 bytes at a time for trigger characters and skips 97% of clean lines before any regex runs.
Aho-Corasick DFA: all 702 patterns are matched in a single O(n) pass over the input with no backtracking, so scan time grows with line length rather than with the number of patterns.
PCRE2 JIT: runs only on the ~5% of lines with an Aho-Corasick hit. Regexes are JIT-compiled, with match limits as ReDoS protection.
Zero heap allocations: an arena allocator serves all memory, so there is no malloc in the hot path and no GC pauses.
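
The multi-literal scan can be illustrated with a textbook Aho-Corasick automaton. The version below is a plain Python sketch for explanation only: it follows failure links at match time instead of precomputing a full DFA, and it is not PlumbrC's C implementation. All names are illustrative.

```python
from collections import deque
from typing import Dict, Iterable, List, Tuple


def build_automaton(literals: Iterable[str]):
    """Build trie transitions, failure links, and output lists for the literals."""
    goto: List[Dict[str, int]] = [{}]   # goto[state][char] -> next state
    fail: List[int] = [0]               # fallback state after a mismatch
    out: List[List[str]] = [[]]         # literals recognized on reaching each state

    for word in literals:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word)

    # Breadth-first pass: a node's failure link is derived from its parent's.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]  # inherit matches ending here
    return goto, fail, out


def find_literals(text: str, automaton) -> List[Tuple[int, str]]:
    """Return (start_index, literal) for every occurrence in one pass over text."""
    goto, fail, out = automaton
    state = 0
    hits = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            hits.append((i - len(word) + 1, word))
    return hits


automaton = build_automaton(["AKIA", "ghp_", "@"])
print(find_literals("key=AKIAIOSFODNN7EXAMPLE mail=user@example.com", automaton))
```

Each input character advances the automaton once, plus a bounded amount of fallback work, regardless of how many literals were loaded. Only lines that produce a hit would then be handed to the more expensive regex stage.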