head	1.9;
access;
symbols
	pkgsrc-2026Q1:1.7.0.2
	pkgsrc-2026Q1-base:1.7
	pkgsrc-2025Q4:1.5.0.2
	pkgsrc-2025Q4-base:1.5
	pkgsrc-2025Q3:1.2.0.2
	pkgsrc-2025Q3-base:1.2;
locks; strict;
comment	@# @;


1.9
date	2026.04.16.11.37.03;	author pin;	state Exp;
branches;
next	1.8;
commitid	sPeZDg2xCnt41bCG;

1.8
date	2026.04.09.18.12.26;	author pin;	state Exp;
branches;
next	1.7;
commitid	38pkdPO8pttFqjBG;

1.7
date	2026.03.08.13.28.11;	author pin;	state Exp;
branches;
next	1.6;
commitid	zTob0LhRMQNVSaxG;

1.6
date	2026.02.17.13.55.58;	author pin;	state Exp;
branches;
next	1.5;
commitid	54K8PC7HVDOjEJuG;

1.5
date	2025.11.29.20.21.13;	author pin;	state Exp;
branches;
next	1.4;
commitid	6LaNADyH9XdVlukG;

1.4
date	2025.11.18.13.41.20;	author pin;	state Exp;
branches;
next	1.3;
commitid	xY5lO98qKM1Fu2jG;

1.3
date	2025.09.22.13.12.58;	author pin;	state Exp;
branches;
next	1.2;
commitid	wiAOREXpQ1HwaIbG;

1.2
date	2025.07.31.11.49.48;	author pin;	state Exp;
branches;
next	1.1;
commitid	9V8TJEtQtxTBpT4G;

1.1
date	2025.07.26.08.58.27;	author pin;	state Exp;
branches;
next	;
commitid	NryrI6RLNRcNCe4G;


desc
@@


1.9
log
@textproc/xan: update to 0.57.1

Fixes

    Fixing xan sort --check -n & xan dedup --check -n printed report.
    Fixing xan parallel cat -nS.
    Fixing CSV parsing rare edge case panics.
    Fixing commands relying on zero-copy CSV parsing for performance.

Performance

    Improving performance of xan flatten, xan view, xan to & xan network.
    Faster xan network -f nodelist in some cases.
@
text
@# $NetBSD: Makefile,v 1.8 2026/04/09 18:12:26 pin Exp $

DISTNAME=	xan-0.57.1
CATEGORIES=	textproc
MASTER_SITES=	${MASTER_SITE_GITHUB:=medialab/}
GITHUB_TAG=	${PKGVERSION_NOREV}

MAINTAINER=	pkgsrc-users@@NetBSD.org
HOMEPAGE=	https://github.com/medialab/xan/
COMMENT=	CSV handling tool
LICENSE=	unlicense

RUST_REQ=	1.83.0
USE_LANGUAGES=	c
USE_TOOLS+=	pax

INSTALLATION_DIRS+=	share/doc/xan

post-install:
	${INSTALL_DATA} ${WRKSRC}/README.md ${DESTDIR}${PREFIX}/share/doc/xan
	cd ${WRKSRC}/docs && ${PAX} -pp -rw * ${DESTDIR}${PREFIX}/share/doc/xan

.include "cargo-depends.mk"

.include "../../lang/rust/cargo.mk"
.include "../../mk/bsd.pkg.mk"
@


1.8
log
@textproc/xan: update to 0.57.0

The temporal update.

Breaking

    xan select -n will not error anymore on empty inputs and, generally, empty files should not trigger selection errors when using commands with -n/--no-headers.
    xan heatmap -C/--cram becomes a flag accepting either auto, always or never.
    Dropping -C short flag for xan sort --cells (it could be confused with --columns or --check).
    Completely overhauled how datetimes work in moonblade.
    xan separate will not trim split values with some modes by default anymore.
    Dropping xan network --stats in favor of -f stats.
    -D becomes short flag for xan network --degrees instead of --disjoint-keys.
    xan separate --capture-groups is dropped in favor of -c/--captures & -C/--all-captures.
    Renaming xan search --breakdown shortflag to -b to allow for future -B/--before-context.

Features

    Adding xan matrix count & xan matrix adj.
    Adding front_coding window function.
    Timestamp support with xan plot -LT.
    Adding xan rename -n/--no-headers support for -p/--prefix & -x/--suffix.
    Adding xan from -f parquet (requires the parquet feature).
    Adding xan to latex.
    Adding xan top -L/--lexicographic.
    Adding xan heatmap flags: -w/--width, -F/--fill, -a/--align, -U/--unit, -Z/--show-normalized, -A/--ascii, -l/--label & -v/--values.
    Adding new gradients to xan heatmap.
    Adding range & repeat moonblade functions.
    Adding xan sort --columns.
    Adding xan view -T/--tee.
    Adding now, fractional_days, to_timezone, to_local_timezone, with_timezone, with_local_timezone, without_timezone, to_timestamp, to_timestamp_ms, from_timestamp, from_timestamp_ms, span, date & time moonblade functions.
    Better type inference with xan stats, and the type & types aggregation functions, now including more types for temporal values (zoned_datetime, datetime, date & time).
    Adding xan input -T/--tolerant.
    Adding xan separate --trim.
    Adding xan grep -B/--before-context & -A/--after-context.
    Adding xan network -f=components, -S/--simple, --union-find, --minify & --sample-size <n>.
    Adding xan plot --timezone.
    Adding xan hist --log shorthand flag for --scale=log.
    Adding log_dist sparkline column to xan stats -q output.
    Adding dist & log_dist aggregation functions.
    Adding xan search -L/--levenshtein <k> & -D/--damerau-levenshtein <k>.

Fixes

    Fixing xan separate automatic column prefix extraction.
    Fixing xan heatmap -n.
    Fixing xan heatmap --repeat-headers --cram always not repeating x-axis legend.
    Fixing correctness of xan plot -T and increasing resolution to microseconds.
    Fixing moonblade column-related functions returning incorrect results wrt -n/--no-headers.
    xan search should now properly error when handling invalid utf-8 in relevant modes.
    Fixing xan search -iR & xan search -i --replacement-column.

Performance

    Improving performance of xan complete, xan top, xan plot -T & xan hist.
    Improving overall performance of xan network.
    Slightly optimizing xan vocab by avoiding needless heap allocation & indirection.
    Improving performance and memory usage of xan separate.

Quality of Life

    Adding proper help to xan heatmap.
@
text
@d1 1
a1 1
# $NetBSD: Makefile,v 1.7 2026/03/08 13:28:11 pin Exp $
d3 1
a3 1
DISTNAME=	xan-0.57.0
@


1.7
log
@textproc/xan: update to 0.56.0

Features

    Adding xan bisect.
    Adding xan flatten -N/--non-empty.
    Adding the soundex, refined_soundex & phonogram moonblade functions for phonetic encoding.

Fixes

    Fixing xan to (md|html) --no-headers.
    Fixing xan plot -R/--regression-line.

Quality of Life

    Adding xan to markdown as an alias for xan to md.
    xan flatten & xan view will stop masquerading trimmed empty cells as empty.
@
text
@d1 1
a1 1
# $NetBSD: Makefile,v 1.6 2026/02/17 13:55:58 pin Exp $
d3 1
a3 1
DISTNAME=	xan-0.56.0
@


1.6
log
@textproc/xan: update to 0.55.0

Breaking

    Changing how xan separate generates default column names.
    xan from -f=(json|ndjson|jsonl) will now emit columns in input order by default.
    Changing xan to -B/--buffer-size to --sample-size to harmonize flag names with xan from.

Features

    Adding the xan complete command.
    Adding an optional unit to ceil, floor, round & trunc moonblade functions. E.g. floor to nearest decade: floor(year, 10).
    Adding basename & dirname moonblade functions.
    Adding parse_py_literal moonblade function. Useful to deal with files dubiously serialized using pandas.
    Adding xan view --repeat-headers=(auto|always|never).
    Adding xan view --reveal-whitespace=(auto|always|never).
    Adding --color support to XAN_VIEW_ARGS.
    Adding xan from -f json --sample-size -1 to sample the whole file.
    Adding xan from -f json --single-object.
    Adding xan from --sort-keys.
    Adding xan to (json|ndjson|jsonl) --sample-size -1 to sample the whole file.
    Adding xan to (json|ndjson|jsonl) --strings flag.
    Adding xan separate --prefix.
    Adding xan heatmap -C short flag for --cram.
    Adding xan heatmap --repeat-headers.
    Adding rank, cume_dist, percent_rank and ntile window functions.
    Adding xan help --color.

Fixes

    Fixing xan select -ne incorrectly emitting headers.

Quality of Life

    xan view -p will not print bottom header anymore by default.
    xan view will not reveal problematic whitespace if output is not colored anymore, by default.
    Better xan hist error messages and help.
    Testing more file name variants when searching for a .gzi index.
@
text
@d1 1
a1 1
# $NetBSD: Makefile,v 1.5 2025/11/29 20:21:13 pin Exp $
d3 1
a3 1
DISTNAME=	xan-0.55.0
@


1.5
log
@textproc/xan: update to 0.54.1

Fixes
 - Fixing xan freq --groupby incorrectly unescaping group cells.
 - Fixing help related to xan pivot & xan unpivot.
 - Upgrading simd-csv to get safety fixes.
@
text
@d1 1
a1 1
# $NetBSD: Makefile,v 1.4 2025/11/18 13:41:20 pin Exp $
d3 1
a3 1
DISTNAME=	xan-0.54.1
@


1.4
log
@textproc/xan: update to 0.54.0

Breaking

    Bumping MSRV to 1.83.0.
    Dropping xan plot -Y/--add-series. It is now possible to select multiple columns as <y> in xan plot <x> <y> instead.
    Dropping the -C/--force-colors flag in flatten, heatmap, hist, plot and view in favor of the more standardized and flexible --color=(auto|never|always) flag.
    xan join will now automatically drop joined columns from one of the files when it is obviously safe to do so.
    xan behead & xan rename do not normalize the output anymore to be as fast as possible.
    The new SIMD CSV parser might not deal with CSV irregular cases the same way rust-csv did. In any case, xan input will still continue to use rust-csv.
    xan slice -B/--byte-offset & xan slice -A/--accumulate are now mutually exclusive.
    xan input has been overhauled.
    Dropping xan count --sample-size.
    Overhauling xan fixlengths to accept streams by shifting default from double-pass read to buffering the whole stream into memory.
    xan plot --x-scale log & --y-scale log are now natural log. Use log10 for the base10 log as before.
    Dropping xan reverse -m/--in-memory flag. Behavior is now automatically detected.
    Dropping xan shuffle -m/--in-memory flag. Loading the file into memory is now the default. The xan shuffle -e/--external flag has been added if you want the old default behavior.
    xan bins now outputs <empty> values instead of <nulls>.
    Overhauling xan bins. The default is now to find nice boundaries for the bins. Use -e/--exact to revert to the old behavior. The default number of bins is now 10, and won't use Freedman-Diaconis rule by default. A -H/--heuristic flag has been added if you want to automatically select a suitable number of bins.

Features

    Adding xan flatten -F/--flatter.
    xan pivot can now target multiple columns.
    Adding the xan grep command for fast but coarse filtering.
    Adding xan search -f/--flag.
    Adding xan map -F/--filter.
    xan search -B/--breakdown now consolidates the results when multiple patterns have the same name.
    Adding xan flatten --row-separator.
    Adding xan flatten --csv.
    Adding xan headers --color.
    Adding the xan join <columns> <input1> <input2> arity as a convenience when joined column names are the same in both inputs.
    Adding xan join -D/--drop-key=(none|both|left|right).
    Adding xan fuzzy-join -D/--drop-key=(none|both|left|right).
    Adding xan plot -A/--aggregate.
    Adding support for plural selection clauses in both xan select -e & xan map e.g. xan map 'full_name.split(" ") as (first_name, last_name)'.
    Adding xan search -P/--add-pattern.
    Adding xan groupby -M/--along-matrix.
    Adding xan groupby -T/--total.
    Adding support for .ndjson & .jsonl files. Those are considered as headless TSV files with null byte quoting so you can easily use them with xan commands.
    Adding out-of-the-box support for .vcf, .sam, .bed, .gtf & .gff2 files.
    Adding a xan cat cols alias to xan cat columns.
    Adding zstd support.
    Adding earliest & latest moonblade functions.
    Adding xan dedup -f/--flag.
    Adding -k short flag for xan dedup --keep-duplicates, and -C short flag for xan dedup --choose.
    Adding xan fixlengths -H/--trust-header.
    Adding xan separate.
    Adding full log scale support to xan plot.
    Adding xan hist --scale.
    xan window is now able to run total aggregations.
    Adding thousands_sep, comma and significance kwargs to numfmt moonblade function.

Fixes

    Fixing xan dedup --check bug where the first record was ignored.
    Fixing xan hist -D when the same date is found multiple times.
    Fixing xan from -f xls datetime conversion.
    Fixing xan flatten & xan view when column names contain line breaks.
    Fixing invalid argument parsing error being printed to stdout instead of stderr.
    Fixing xan progress SIGINT corrupting output.
    Fixing xan enum -A/--accumulate.
    Fixing xan from -f tar when tarball archive is not gzipped.
    Fixing min & max moonblade functions when passing a list of numbers.
    Fixing xan flatten -H edge cases.
    Fixing commands requiring seekable streams accepting unindexed compressed files by mistake.
    Fixing xan plot --count --y-scale log.

Performance

    Wildly improving performance of most xan commands by leveraging a novel SIMD CSV parser/writer.
    Improving performance of xan from -f txt & xan from -f npy.
    Improving memory footprint of hash-based commands (e.g. frequency, groupby, dedup etc.).
    Improving performance of xan progress, xan range, xan enum, xan behead, xan rename.

Quality of Life

    xan parallel cat now flushes more consistently.
    Better highlighting of problematic strings in xan flatten, xan view & xan headers.
    xan parallel will now generally stop as soon as an error is detected in a subprocess and cleanly report errors.
    Better argv parsing error UX in general.
    The -p flag will now avoid going further than 16 to avoid issues on servers with many CPUs where hogging the resources is an issue and where using too many threads at once could hurt performance. The -t flag remains available to tweak the number of threads.
    xan hist will now dim bars having a 0 count so you can easily distinguish them from non-empty bars.
@
text
@d1 1
a1 1
# $NetBSD: Makefile,v 1.3 2025/09/22 13:12:58 pin Exp $
d3 1
a3 1
DISTNAME=	xan-0.54.0
@


1.3
log
@textproc/xan: update to 0.53.0

Breaking

    xan partition now normalizes filenames to lowercase to correctly deal with case-insensitive filesystems. xan partition also gets a related -C/--case-sensitive flag.

Features

    Adding all and any moonblade higher-order functions.
    Allowing moonblade printf function to be called with lists.
    Adding -f/--evaluate-file flag to map, filter, flatmap & transform commands.
    Adding xan map -O/--overwrite.

Fixes

    Fixing xan top -T/--ties edge case.
    Fixing broken pipe panics for some commands.
    Dropping remnant dbg! macro when reading files in reverse.

Performance

    Using jemallocator for musl builds.

Quality of Life

    Better moonblade printf function error messages.
@
text
@d1 1
a1 1
# $NetBSD: Makefile,v 1.2 2025/07/31 11:49:48 pin Exp $
d3 1
a3 1
DISTNAME=	xan-0.53.0
d13 1
a13 3
#RUST_REQ=	1.90.0
#Upstream does not state the required MSRV.
#This package is confirmed to build with Rust 1.90.0 on amd64
@


1.2
log
@textproc/xan: update to 0.52.0

Breaking

    xan search --count will not emit rows with 0 matches anymore unless --left is used.

Features

    xan transform is now able to work on a selection of columns, rather than on a single column.
    Adding the xan unpivot command.
    Adding the xan pivot command.
    Adding xan join --semi & xan join --anti commands.
    Adding xan slice --raw.
    Adding default expression argument to lead & lag window functions.
    Adding shlex_split, cmd and shell moonblade functions.
    Adding aarch64-apple-darwin and aarch64-unknown-linux-gnu to CI builds.
    Adding to_fixed moonblade function.
    Adding decimal places optional argument to ratio & percentage aggregation functions.
    Adding frac & dense_rank aggregation functions to xan window.

Fixes

    Loosening xan partition sanitizer to allow hyphens, dashes and points.
    Fixing xan parallel --progress display.
    Fixing logic error in xan search -B when used without --left.
    Fixing xan parallel cat when working on file chunks with -P or -H.
    Fixing moonblade list/string slicing with some combinations of negative indices.
    Fixing moonblade split function not using regex patterns properly.
    Fixing moonblade parsing wrt regex patterns and comments (using a regex pattern containing # was not possible).
    Fixing lead window aggregation function when working on any column that is not the first one.
    Fixing xan view -S/--significance being overzealous, especially wrt integers.

Performance

    Improving performance of xan parallel when working on file chunks.

Quality of Life

    xan headers now reports more useful information when files have diverging headers.
    Better error messages for read_json and parse_json moonblade functions.
    xan view -p will not engage pager when input errored or is empty.
    xan select -e & -f become boolean flags instead of error-inducing invocation variants.
@
text
@d1 1
a1 1
# $NetBSD: Makefile,v 1.1 2025/07/26 08:58:27 pin Exp $
d3 1
a3 1
DISTNAME=	xan-0.52.0
d13 1
a13 1
#RUST_REQ=		1.88.0
d15 1
a15 1
#This package is confirmed to build with Rust 1.88.0 on amd64
@


1.1
log
@textproc/xan: import package

Packaged in wip by wiz.

`xan` is a command line tool that can be used to process CSV files
directly from the shell.

It has been written in Rust to be as fast as possible, use as little
memory as possible, and can easily handle very large CSV files
(Gigabytes). It is also able to leverage parallelism (through
multithreading) to make some tasks complete as fast as your computer
can allow.

It can easily preview, filter, slice, aggregate, sort, join CSV
files, and exposes a large collection of composable commands that
can be chained together to perform a wide variety of typical tasks.

`xan` also leverages its own expression language so you can perform
complex tasks that cannot be done by relying on the simplest
commands. This minimalistic language has been tailored for CSV data
and is faster than evaluating typical dynamically-typed languages
such as Python, Lua, JavaScript etc.
@
text
@d1 1
a1 1
# $NetBSD$
d3 1
a3 1
DISTNAME=	xan-0.51.0
@

