PlumbrC Documentation

Everything you need to detect and redact secrets from logs, streams, and text data.

Interface	Usage	Speed
CLI	Pipe or file mode	5M lines/sec
Python	pip install plumbrc	2M lines/sec (bulk)
REST API	POST /api/redact	~0.1ms server

Quick Start

Option 1: Native C Binary (fastest)

bash

# Install dependencies
sudo apt install build-essential libpcre2-dev

# Clone and build
git clone https://github.com/AmritRai1234/plumbrC.git
cd plumbrC
make -j$(nproc)

# Run
echo "api_key=AKIAIOSFODNN7EXAMPLE" | ./build/bin/plumbr
# Output: api_key=[REDACTED:aws_key]

Option 2: Python Package

bash

pip install plumbrc

python

from plumbrc import Plumbr

p = Plumbr()
result = p.redact("api_key=AKIAIOSFODNN7EXAMPLE")
print(result)  # api_key=[REDACTED:aws_key]

Option 3: REST API (for testing)

bash

curl -X POST https://plumbr.ca/api/redact \
  -H "Content-Type: application/json" \
  -d '{"text": "api_key=AKIAIOSFODNN7EXAMPLE"}'

Note: The REST API is for testing and prototyping. The server processes each request in ~0.1ms, so the 100–800ms network round trip is the bottleneck. For production throughput, use the native C binary or the Python package.

CLI Usage

The plumbr binary reads from stdin and writes redacted output to stdout, so it drops into any Unix pipeline.

Basic usage

bash

# Pipe mode — redact in real-time
tail -f /var/log/app.log | plumbr | tee safe.log

# File mode — redact an entire file
plumbr < app.log > redacted.log

# Inline test
echo "password=s3cret123 token=ghp_abc123xyz" | plumbr

Options

Flag	Description	Default
-p, --patterns FILE	Load patterns from FILE	built-in defaults
-d, --defaults	Use built-in default patterns (14)	on
-D, --no-defaults	Disable built-in defaults	—
-j, --threads N	Worker threads (0 = auto-detect)	0 (auto)
-q, --quiet	Suppress statistics output	off
-s, --stats	Print statistics to stderr	on
-H, --hwinfo	Show hardware detection info (SIMD, cores)	—
-v, --version	Show version	—
-h, --help	Show help	—

Examples

bash

# Use custom patterns only
plumbr -D -p my_patterns.txt < app.log > safe.log

# Use defaults + custom patterns
plumbr -p extra_patterns.txt < app.log > safe.log

# Load all 702 patterns
plumbr -p patterns/all.txt < app.log > safe.log

# Multi-threaded (8 workers)
plumbr -j 8 < huge.log > safe.log

# CI/CD pipeline — redact before shipping logs
docker logs my-app 2>&1 | plumbr | your-log-shipper   # placeholder: substitute your own shipping command

# Hardware info
plumbr -H
# AVX2: yes, SSE 4.2: yes, Cores: 12, Threads: 24
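When plumbr is driven from a script rather than a shell, the flags from the Options table can be assembled programmatically. The following is a minimal Python sketch, not part of PlumbrC: the helper names build_plumbr_command and redact_text are hypothetical, and it assumes the plumbr binary is on PATH. Only flags listed in the table above are used.

```python
import subprocess


def build_plumbr_command(binary="plumbr", pattern_file=None,
                         use_defaults=True, threads=0, quiet=False):
    """Assemble an argv list using only flags from the Options table."""
    cmd = [binary]
    if not use_defaults:
        cmd.append("-D")                     # disable built-in defaults
    if pattern_file is not None:
        cmd += ["-p", str(pattern_file)]     # load patterns from a file
    if threads:
        cmd += ["-j", str(threads)]          # 0 keeps auto-detection
    if quiet:
        cmd.append("-q")                     # suppress statistics output
    return cmd


def redact_text(text, **options):
    """Pipe text through plumbr and return its stdout (hypothetical helper)."""
    result = subprocess.run(build_plumbr_command(**options), input=text,
                            capture_output=True, text=True, check=True)
    return result.stdout


print(build_plumbr_command(pattern_file="my_patterns.txt", use_defaults=False))
# ['plumbr', '-D', '-p', 'my_patterns.txt']
```

Passing the command as a list avoids shell quoting issues, and check=True turns a non-zero exit status into an exception instead of silently returning unredacted or empty output.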

C Library API

Embed PlumbrC directly into your C/C++ application via libplumbr. Available as both static (.a) and shared (.so) libraries.

Build the library

bash

# Static library (libplumbr.a)
make lib

# Shared library (libplumbr.so)
make shared

# Files created:
# build/lib/libplumbr.a    — static
# build/lib/libplumbr.so   — shared

Basic example

#include <libplumbr.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    // Create instance with default patterns
    libplumbr_t *p = libplumbr_new(NULL);
    if (!p) return 1;

    printf("Loaded %zu patterns\n", libplumbr_pattern_count(p));

    // Redact a single line
    const char *input = "AWS key: AKIAIOSFODNN7EXAMPLE";
    size_t out_len;
    char *safe = libplumbr_redact(p, input, strlen(input), &out_len);

    if (safe) {
        printf("Output: %s\n", safe);
        // Output: AWS key: [REDACTED:aws_key]
        free(safe);  // Caller owns the result
    }

    // Get statistics
    libplumbr_stats_t stats = libplumbr_get_stats(p);
    printf("Lines: %zu, Modified: %zu\n",
           stats.lines_processed, stats.lines_modified);

    libplumbr_free(p);
    return 0;
}

Compile & link

bash

# With static library (name the archive directly: -lplumbr prefers libplumbr.so when both exist)
gcc -Iinclude my_app.c build/lib/libplumbr.a -lpcre2-8 -lpthread -o my_app

# With shared library
gcc -Iinclude my_app.c -Lbuild/lib -lplumbr -lpcre2-8 -lpthread \
    -Wl,-rpath,./build/lib -o my_app

Configuration

libplumbr_config_t config = {
    .pattern_file = "/etc/plumbr/patterns/all.txt",
    .pattern_dir  = NULL,        // Or load all files from a directory
    .num_threads  = 0,           // 0 = auto-detect
    .quiet        = 1,           // Suppress stats output
};

libplumbr_t *p = libplumbr_new(&config);

In-place redaction (zero-copy)

// Redact without allocating new memory
char buffer[1024];
strcpy(buffer, "token=ghp_xxxxx");
size_t len = strlen(buffer);

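int new_len = libplumbr_redact_inplace(p, buffer, len, sizeof(buffer));
// Assumes a negative return signals failure; also guard the terminator write
if (new_len >= 0 && (size_t)new_len < sizeof(buffer)) {
    buffer[new_len] = '\0';
    printf("%s\n", buffer);
}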

Batch processing

const char *inputs[] = {
    "key=AKIAIOSFODNN7EXAMPLE",
    "Normal log line",
    "email: user@example.com",
};
size_t input_lens[] = { 24, 15, 23 };  // byte lengths, excluding the NUL terminator
char *outputs[3];
size_t output_lens[3];

int count = libplumbr_redact_batch(p, inputs, input_lens,
                                   outputs, output_lens, 3);

for (int i = 0; i < count; i++) {
    printf("%s\n", outputs[i]);
    free(outputs[i]);
}

Full API reference

Function	Description
libplumbr_new(config)	Create instance (NULL for defaults)
libplumbr_redact(p, input, len, &out_len)	Redact a line → new string (caller frees)
libplumbr_redact_inplace(p, buf, len, cap)	In-place redaction (zero-copy)
libplumbr_redact_batch(p, ins, lens, outs, olens, n)	Batch process multiple lines
libplumbr_get_stats(p)	Get processing statistics
libplumbr_reset_stats(p)	Reset counters to zero
libplumbr_pattern_count(p)	Number of loaded patterns
libplumbr_version()	Version string
libplumbr_is_threadsafe()	Returns 1 if thread-safe
libplumbr_free(p)	Free all resources

Python Package

The plumbrc package wraps the C engine via ctypes: the same engine, with no network overhead. Throughput is about 2M lines/sec with the bulk API.

Install

bash

pip install plumbrc

Basic usage

python

from plumbrc import Plumbr

# Create a redactor (loads 14 default patterns)
p = Plumbr()
print(f"Loaded {p.pattern_count} patterns")

# Redact a single line
result = p.redact("AWS key: AKIAIOSFODNN7EXAMPLE")
print(result)  # AWS key: [REDACTED:aws_key]

# Redact multiple lines
lines = [
    "User login with api_key=sk-proj-abc123",
    "Normal log line with no secrets",
    "Email sent to user@example.com",
]
safe_lines = p.redact_lines(lines)
for line in safe_lines:
    print(line)

Context manager

python

from plumbrc import Plumbr

# Automatically cleans up C resources when done
with Plumbr() as p:
    result = p.redact("password=s3cret123")
    print(result)  # password=[REDACTED:password]

Custom patterns

python

from plumbrc import Plumbr

# Load patterns from a custom file
p = Plumbr(pattern_file="/path/to/patterns.txt")

# Or load from a directory of pattern files
p = Plumbr(pattern_dir="/etc/plumbr/patterns/")

Statistics

python

p = Plumbr()

# Process some text
for i in range(1000):
    p.redact(f"api_key=AKIAIOSFODNN7EXAMPLE line {i}")

# Check stats
stats = p.stats
print(f"Lines processed: {stats['lines_processed']}")
print(f"Lines modified:  {stats['lines_modified']}")
print(f"Patterns matched: {stats['patterns_matched']}")

# Get version
print(f"PlumbrC version: {Plumbr.version()}")

Process log files

python

from plumbrc import Plumbr

p = Plumbr()

# Redact an entire log file
with open("app.log") as f_in, open("safe.log", "w") as f_out:
    for line in f_in:
        f_out.write(p.redact(line))

print(f"Done. Stats: {p.stats}")
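The loop above calls redact once per line. Since the 2M lines/sec figure is quoted for the bulk API, large files can instead be fed to redact_lines in chunks. The following is a minimal sketch under stated assumptions: redact_in_chunks and the chunk size of 10,000 are illustrative choices, not part of plumbrc, and redact_lines is assumed to return one output line per input line, in order. A stand-in function replaces the real redactor so the sketch runs without the library.

```python
from itertools import islice


def redact_in_chunks(lines, redact_lines, chunk_size=10_000):
    """Yield redacted lines, calling redact_lines once per chunk."""
    iterator = iter(lines)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            break
        yield from redact_lines(chunk)


# Stand-in for Plumbr().redact_lines (hypothetical, for demonstration only).
def fake_redact_lines(batch):
    return [line.replace("s3cret123", "[REDACTED:password]") for line in batch]


sample = ["password=s3cret123\n", "Normal log line\n"]
for line in redact_in_chunks(sample, fake_redact_lines, chunk_size=1):
    print(line, end="")
```

With the real package this could be used as f_out.writelines(redact_in_chunks(f_in, p.redact_lines)), assuming redact_lines preserves each line's trailing newline. Chunking keeps memory bounded while still amortizing the per-call FFI cost.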

Error handling

python

import sys

from plumbrc import Plumbr, LibraryNotFoundError, RedactionError

try:
    p = Plumbr()
except LibraryNotFoundError:
    # Without the shared library there is no redactor to use, so stop here
    sys.exit("libplumbr.so not found. Install PlumbrC first.")

some_text = "password=s3cret123"

try:
    result = p.redact(some_text)
except RedactionError as e:
    print(f"Redaction failed: {e}")

REST API

Base URL: https://plumbr.ca/api

Note: The C engine processes each request in ~0.1ms, but the network round trip adds 100–800ms depending on your location. The REST API is best for testing and prototyping. For production throughput, run the native C binary or the Python package locally.

POST /api/redact

Redact secrets from a text string. Handles multi-line input.

bash

curl -X POST https://plumbr.ca/api/redact \
  -H "Content-Type: application/json" \
  -d '{"text": "AWS key: AKIAIOSFODNN7EXAMPLE\nemail: user@example.com"}'

json

{
  "redacted": "AWS key: [REDACTED:aws_key]\nemail: [REDACTED:email]",
  "stats": {
    "lines_processed": 2,
    "lines_modified": 2,
    "patterns_matched": 2,
    "processing_time_ms": 0.066
  }
}
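The request and response shapes above can be handled from Python with the standard library alone. This is a minimal sketch, not an official client: build_redact_request and parse_redact_response are hypothetical helper names, and error responses are not handled because their format is not documented here. The request is built but not sent, so the example runs offline against the sample response shown above.

```python
import json
import urllib.request

BASE_URL = "https://plumbr.ca/api"


def build_redact_request(text, base_url=BASE_URL):
    """Build (but do not send) the POST /api/redact request."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/redact",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_redact_response(body):
    """Return (redacted_text, stats) from a response shaped like the example."""
    doc = json.loads(body)
    return doc["redacted"], doc["stats"]


# Parse the sample response from the documentation.
sample = json.dumps({
    "redacted": "AWS key: [REDACTED:aws_key]\nemail: [REDACTED:email]",
    "stats": {"lines_processed": 2, "lines_modified": 2,
              "patterns_matched": 2, "processing_time_ms": 0.066},
})
redacted, stats = parse_redact_response(sample)
print(redacted)
print(stats["lines_modified"])
```

To actually send the request, pass the result of build_redact_request to urllib.request.urlopen with a timeout and feed resp.read() to parse_redact_response.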

POST /api/redact/batch

Process multiple texts in a single request. More efficient for bulk operations.

bash

curl -X POST https://plumbr.ca/api/redact/batch \
  -H "Content-Type: application/json" \
  -d '{
    "texts": [
      "key=AKIAIOSFODNN7EXAMPLE",
      "Normal log line",
      "token=ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    ]
  }'

GET /health

Health check and server statistics.

bash

curl https://plumbr.ca/health

json

{
  "status": "healthy",
  "version": "1.0.2",
  "server_version": "1.0.0",
  "uptime_seconds": 3600.0,
  "patterns_loaded": 14,
  "stats": {
    "requests_total": 1523,
    "requests_ok": 1520,
    "requests_error": 3,
    "bytes_processed": 524288,
    "avg_rps": 0.4
  }
}

Self-host the API server

bash

# Build the server
make server

# Run on custom port
./build/bin/plumbr-server --port 9090 --threads 4

# With all patterns
./build/bin/plumbr-server --pattern-file patterns/all.txt

# Server options:
#   --port PORT         Listen port (default: 8080)
#   --host ADDR         Bind address (default: 0.0.0.0)
#   --threads N         Worker threads (0=auto)
#   --pattern-dir DIR   Load patterns from directory
#   --pattern-file FILE Load patterns from file

Custom Patterns

PlumbrC ships with 702 patterns in 12 categories. You can also define your own.

Pattern file format

One pattern per line. Four pipe-delimited fields:

text

name|literal|regex|replacement

# Fields:
#   name        — Human-readable label (used in [REDACTED:name])
#   literal     — Literal prefix for Aho-Corasick (fast pre-filter)
#   regex       — PCRE2 regex for exact matching
#   replacement — Replacement text (usually [REDACTED:name])

Example patterns

text

# AWS Access Key
aws_key|AKIA|AKIA[0-9A-Z]{16}|[REDACTED:aws_key]

# GitHub Personal Access Token
github_pat|ghp_|ghp_[A-Za-z0-9]{36}|[REDACTED:github_pat]

# Email address
email|@|[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}|[REDACTED:email]

# Custom API key for your service
my_api_key|MYKEY_|MYKEY_[a-f0-9]{32}|[REDACTED:my_api_key]
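Before loading a custom file, it can help to check that every line splits into the four fields described above. The following is a minimal Python sketch, not PlumbrC's own parser: parse_pattern_line is a hypothetical helper, lines starting with # are treated as comments based on the examples above, and any extra | characters are assumed to belong to the regex field. The regex is compiled with Python's re module as a rough syntax check only, since PCRE2 and re differ in places.

```python
import re


def parse_pattern_line(line):
    """Split one pattern line into (name, literal, regex, replacement).

    Returns None for blank lines and '#' comments. Assumption: name, literal
    and replacement contain no '|', so any extra pipes belong to the regex.
    Raises ValueError on too few fields and re.error on a bad regex.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    parts = line.split("|", 2)
    if len(parts) != 3 or "|" not in parts[2]:
        raise ValueError(f"expected 4 pipe-delimited fields: {line!r}")
    name, literal, rest = parts
    regex, replacement = rest.rsplit("|", 1)
    re.compile(regex)  # rough syntax check; Python's re is not PCRE2
    return name, literal, regex, replacement


print(parse_pattern_line("aws_key|AKIA|AKIA[0-9A-Z]{16}|[REDACTED:aws_key]"))
# ('aws_key', 'AKIA', 'AKIA[0-9A-Z]{16}', '[REDACTED:aws_key]')
```

Running this over each line of a pattern file catches missing fields early, before the file reaches plumbr -p.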

Built-in categories

Category	Covers
cloud/	AWS, GCP, Azure, DigitalOcean
auth/	OAuth, JWT, session tokens
vcs/	GitHub, GitLab, Bitbucket
payment/	Stripe, PayPal, Square
database/	PostgreSQL, MySQL, MongoDB
crypto/	Private keys, mnemonics
pii/	Email, phone, SSN, IP
secrets/	Generic API keys, passwords
infra/	Docker, K8s, Terraform
communication/	Slack, Twilio, SendGrid
analytics/	Mixpanel, Segment, Amplitude
social/	Facebook, Twitter tokens

Loading patterns

bash

# Default 14 patterns (built-in, no file needed)
plumbr < app.log > safe.log

# Load all 702 patterns
plumbr -p patterns/all.txt < app.log > safe.log

# Specific category
plumbr -p patterns/cloud/aws.txt < app.log > safe.log

# Multiple: defaults + custom
plumbr -p extra.txt < app.log > safe.log

# Custom only (disable defaults)
plumbr -D -p my_patterns.txt < app.log > safe.log

Integrations

Ready-to-use configs for Docker, Kubernetes, and popular log pipelines.

Docker

yaml

# docker-compose.yml
version: '3.8'
services:
  app:
    image: your-app:latest
    volumes:
      - app-logs:/var/log/app

  plumbr:
    build:
      context: .
      dockerfile: Dockerfile
    entrypoint: >
      sh -c "tail -F /var/log/app/*.log | plumbr > /var/log/safe/app.log"
    volumes:
      - app-logs:/var/log/app:ro
      - safe-logs:/var/log/safe
    restart: unless-stopped

volumes:
  app-logs:
  safe-logs:

Kubernetes sidecar

yaml

# Your app + PlumbrC sidecar in the same pod
containers:
  - name: app
    image: your-app:latest
    volumeMounts:
      - name: logs
        mountPath: /var/log/app

  - name: plumbr
    image: plumbrc:latest
    command: ["sh", "-c", "tail -F /var/log/app/*.log | plumbr > /var/log/safe/app.log"]
    volumeMounts:
      - name: logs
        mountPath: /var/log/app
        readOnly: true
      - name: safe-logs
        mountPath: /var/log/safe
    resources:
      requests: { cpu: 50m, memory: 32Mi }
      limits: { cpu: 200m, memory: 64Mi }

systemd service

ini

[Unit]
Description=PlumbrC Log Redaction Server
After=network.target

[Service]
ExecStart=/usr/local/bin/plumbr-server --port 8080
Restart=always
User=plumbr

[Install]
WantedBy=multi-user.target

Other pipelines

Fluentd: Use the exec_filter plugin to pipe logs through plumbr. Config in integrations/fluentd/

Logstash: Use the pipe filter. Config in integrations/logstash/

Vector: Use the exec transform. Config in integrations/vector/

Building from Source

PlumbrC is pure C11 with minimal dependencies.

Prerequisites

bash

# Ubuntu/Debian
sudo apt install build-essential libpcre2-dev

# Fedora/RHEL
sudo dnf install gcc make pcre2-devel

# macOS (x86_64 only — SIMD requires x86)
brew install pcre2

Build targets

Command	Output	Description
make	build/bin/plumbr	Optimized release binary (-O3, LTO, -march=native)
make debug	build/bin/plumbr	Debug build (-g, -O0)
make server	build/bin/plumbr-server	HTTP API server
make lib	build/lib/libplumbr.a	Static library
make shared	build/lib/libplumbr.so	Shared library (for Python)
make sanitize	build/bin/plumbr	ASan + UBSan build
make install	—	Install to /usr/local/bin
make clean	—	Remove all build artifacts

Quick install script

bash

# One-liner install
curl -sSL https://raw.githubusercontent.com/AmritRai1234/plumbrC/main/install.sh | bash

# Or manually:
git clone https://github.com/AmritRai1234/plumbrC.git
cd plumbrC
make -j$(nproc)
sudo make install
plumbr --version

Performance

Benchmark numbers measured on an AMD Ryzen 5 5600X, with AVX2 SIMD enabled.

Native C binary

Config	Single-threaded	Multi-threaded (8T)
Default (14 patterns)	3.7M lines/sec	4.9M lines/sec
Full (702 patterns)	2.1M lines/sec	3.3M lines/sec

Comparison

Method	Throughput	Bottleneck
Native C binary	~5M lines/sec (4.9M measured, 8 threads)	CPU (AVX2 + SSE 4.2)
Python package (ctypes → C)	2M lines/sec (bulk)	FFI overhead (~0.5μs/line)
REST API (over internet)	~0.1ms per request (server-side)	Network latency (100–800ms RTT)
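To make the comparison concrete, the figures in the tables above can be turned into wall-clock estimates. This is a back-of-the-envelope sketch: the 100-million-line log is a hypothetical workload, and the REST estimate assumes one request per line, sent sequentially, at the best-case 100ms round trip.

```python
def seconds_to_process(lines, lines_per_sec):
    """Wall-clock seconds at a sustained throughput."""
    return lines / lines_per_sec


LINES = 100_000_000  # hypothetical log size

native = seconds_to_process(LINES, 4.9e6)   # native binary, 8 threads, default patterns
python = seconds_to_process(LINES, 2.0e6)   # Python bulk API
# REST: assume one request per line, sent sequentially, at a 100 ms round trip.
rest = LINES * 0.100

print(f"native: {native:.1f} s")            # native: 20.4 s
print(f"python: {python:.1f} s")            # python: 50.0 s
print(f"rest:   {rest / 86400:.0f} days")   # rest:   116 days
```

Batching many lines into each request would shrink the REST figure considerably, but the round trip would still dominate the ~0.1ms of server-side work.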

Why it's fast

SSE 4.2 hardware pre-filter: Scans 16 bytes per instruction for trigger characters, skipping 97% of clean lines before any regex runs.

Aho-Corasick DFA: All 702 patterns are matched in a single O(n) pass over each line. There is no backtracking, and scan time is independent of the pattern count.

PCRE2 JIT: Only fires on the ~5% of lines with an Aho-Corasick hit. Regexes are JIT-compiled and protected against ReDoS by match limits.

No heap allocations in the hot path: An arena allocator serves all memory, so the hot path never calls malloc. C has no garbage collector, so there are no GC pauses either.
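The staged design described above (a cheap literal scan first, the regex only on a hit) can be illustrated in a few lines of Python. This is a simplified sketch, not PlumbrC's implementation: it omits the SSE 4.2 stage, uses a small textbook Aho-Corasick automaton over the literal field, and substitutes Python's re for PCRE2. The three patterns are copied from the Example patterns section, and all function names are hypothetical.

```python
import re
from collections import deque

# (name, literal, regex, replacement), copied from the example patterns.
PATTERNS = [
    ("aws_key", "AKIA", r"AKIA[0-9A-Z]{16}", "[REDACTED:aws_key]"),
    ("github_pat", "ghp_", r"ghp_[A-Za-z0-9]{36}", "[REDACTED:github_pat]"),
    ("email", "@", r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}", "[REDACTED:email]"),
]


def build_automaton(literals):
    """Build Aho-Corasick goto/fail/output tables for the given literals."""
    goto, fail, out = [{}], [0], [set()]
    for index, literal in enumerate(literals):
        state = 0
        for ch in literal:
            if ch not in goto[state]:
                goto[state][ch] = len(goto)
                goto.append({})
                fail.append(0)
                out.append(set())
            state = goto[state][ch]
        out[state].add(index)
    queue = deque(goto[0].values())          # depth-1 states fail to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] |= out[fail[nxt]]       # inherit matches ending at the suffix
            queue.append(nxt)
    return goto, fail, out


def find_hits(automaton, text):
    """Return indexes of all literals found in text, in one pass."""
    goto, fail, out = automaton
    state, hits = 0, set()
    for ch in text:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        hits |= out[state]
    return hits


AUTOMATON = build_automaton([p[1] for p in PATTERNS])
REGEXES = [re.compile(p[2]) for p in PATTERNS]


def redact(line):
    """Run a pattern's regex only if its literal was seen in the line."""
    for index in sorted(find_hits(AUTOMATON, line)):
        replacement = PATTERNS[index][3]
        line = REGEXES[index].sub(lambda m: replacement, line)
    return line


for line in ["key=AKIAIOSFODNN7EXAMPLE", "Normal log line", "email: user@example.com"]:
    print(redact(line))
```

A line such as "Normal log line" produces no literal hits, so no regex runs on it at all. A line containing ghp_ followed by too few characters triggers the regex, which then declines to match and leaves the line unchanged.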