about summary refs log tree commit diff
path: root/tvix/eval
diff options
context:
space:
mode:
Diffstat (limited to 'tvix/eval')
-rw-r--r--tvix/eval/Cargo.toml1
-rw-r--r--tvix/eval/docs/bindings.md133
-rw-r--r--tvix/eval/src/builtins/impure.rs2
-rw-r--r--tvix/eval/src/builtins/mod.rs84
-rw-r--r--tvix/eval/src/builtins/to_xml.rs275
-rw-r--r--tvix/eval/src/errors.rs15
-rw-r--r--tvix/eval/src/io.rs6
-rw-r--r--tvix/eval/src/lib.rs2
-rw-r--r--tvix/eval/src/nix_search_path.rs2
-rw-r--r--tvix/eval/src/tests/mod.rs199
-rw-r--r--tvix/eval/src/tests/nix_tests.rs207
-rw-r--r--tvix/eval/src/tests/one_offs.rs2
-rw-r--r--tvix/eval/src/tests/tvix_tests/eval-okay-toxml-empty.exp.xml5
-rw-r--r--tvix/eval/src/tests/tvix_tests/eval-okay-toxml-empty.nix1
-rw-r--r--tvix/eval/src/tests/tvix_tests/eval-okay-toxml.exp1
-rw-r--r--tvix/eval/src/tests/tvix_tests/eval-okay-toxml.nix2
-rw-r--r--tvix/eval/src/value/json.rs8
-rw-r--r--tvix/eval/src/value/mod.rs4
-rw-r--r--tvix/eval/src/value/string.rs68
-rw-r--r--tvix/eval/src/vm/generators.rs2
-rw-r--r--tvix/eval/src/vm/mod.rs6
-rw-r--r--tvix/eval/tests/nix_oracle.rs11
22 files changed, 698 insertions, 338 deletions
diff --git a/tvix/eval/Cargo.toml b/tvix/eval/Cargo.toml
index 677ce6ab85..cc590791cc 100644
--- a/tvix/eval/Cargo.toml
+++ b/tvix/eval/Cargo.toml
@@ -30,7 +30,6 @@ smol_str = "0.2.0"
 tabwriter = "1.2"
 test-strategy = { version = "0.2.1", optional = true }
 toml = "0.6.0"
-xml-rs = "0.8.4"
 sha2 = "0.10.8"
 sha1 = "0.10.6"
 md-5 = "0.10.6"
diff --git a/tvix/eval/docs/bindings.md b/tvix/eval/docs/bindings.md
new file mode 100644
index 0000000000..2b062cb13d
--- /dev/null
+++ b/tvix/eval/docs/bindings.md
@@ -0,0 +1,133 @@
+Compilation of bindings
+=======================
+
+Compilation of Nix bindings is one of the most mind-bending parts of Nix
+evaluation. The implementation of just the compilation is currently almost 1000
+lines of code, excluding the various insane test cases we dreamt up for it.
+
+## What is a binding?
+
+In short, any attribute set or `let`-expression. Tvix currently does not treat
+formals in function parameters (e.g. `{ name ? "fred" }: ...`) the same as these
+bindings.
+
+They have two very difficult features:
+
+1. Keys can mutually refer to each other in `rec` sets or `let`-bindings,
+   including out of definition order.
+2. Attribute sets can be nested, and parts of one attribute set can be defined
+   in multiple separate bindings.
+
+Tvix resolves as much of this logic statically (i.e. at compile-time) as
+possible, but the procedure is quite complicated.
+
+## High-level concept
+
+The idea behind the way we compile bindings is to fully resolve nesting
+statically, and use the usual mechanisms (i.e. recursion/thunking/value
+capturing) for resolving dynamic values.
+
+This is done by compiling bindings in several phases:
+
+1. An initial compilation phase *only* for plain inherit statements (i.e.
+   `inherit name;`), *not* for namespaced inherits (i.e. `inherit (from)
+   name;`).
+
+2. A declaration-only phase, in which we use the compiler's scope tracking logic
+   to calculate the physical runtime stack indices (further referred to as
+   "stack slots" or just "slots") that all values will end up in.
+
+   In this phase, whenever we encounter a nested attribute set, it is merged
+   into a custom data structure that acts like a synthetic AST node.
+
+   This can be imagined similar to a rewrite like this:
+
+   ```nix
+   # initial code:
+   {
+       a.b = 1;
+       a.c = 2;
+   }
+
+   # rewritten form:
+   {
+       a = {
+           b = 1;
+           c = 2;
+       };
+   }
+   ```
+
+   The rewrite applies to attribute sets and `let`-bindings alike.
+
+   At the end of this phase, we know the stack slots of all namespaces for
+   inheriting from, all values inherited from them, and all values (and
+   optionall keys) of bindings at the current level.
+
+   Only statically known keys are actually merged, so any dynamic keys that
+   conflict will lead to a "key already defined" error at runtime.
+
+3. A compilation phase, in which all values (and, when necessary, keys) are
+   actually compiled. In this phase the custom data structure used for merging
+   is encountered when compiling values.
+
+   As this data structure acts like an AST node, the process begins recursively
+   for each nested attribute set.
+
+At the end of this process we have bytecode that leaves the required values (and
+optionally keys) on the stack. In the case of attribute sets, a final operation
+is emitted that constructs the actual attribute set structure at runtime. For
+`let`-bindings a final operation is emitted that removes these locals from the
+stack when the scope ends.
+
+## Moving parts
+
+WARNING: This documents the *current* implementation. If you only care about the
+conceptual aspects, see above.
+
+There's a few types involved:
+
+* `PeekableAttrs`: peekable iterator over an attribute path (e.g. `a.b.c`)
+* `BindingsKind`: enum defining the kind of bindings (attrs/recattrs/let)
+* `AttributeSet`: struct holding the bindings kind, the AST nodes with inherits
+  (both namespaced and not), and an internal representation of bindings
+  (essentially a vector of tuples of the peekable attrs and the expression to
+  compile for the value).
+* `Binding`: enum describing the kind of binding (namespaced inherit, attribute
+  set, plain binding of *any other value type*)
+* `KeySlot`: enum describing the location in which a key slot is placed at
+  runtime (nowhere, statically known value in a slot, dynamic value in a slot)
+* `TrackedBinding`: struct representing statically known information about a
+  single binding (its key slot, value slot and `Binding`)
+* `TrackedBindings`: vector of tracked bindings, which implements logic for
+  merging attribute sets together
+
+And quite a few methods on `Compiler`:
+
+* `compile_bindings`: entry point for compiling anything that looks like a
+  binding, this calls out to the functions below.
+* `compile_plain_inherits`: takes all inherits of a bindings node and compiles
+  the ones that are trivial to compile (i.e. just plain inherits without a
+  namespace). The `rnix` parser does not represent namespaced/plain inherits in
+  different nodes, so this function also aggregates the namespaced inherits and
+  returns them for further use
+* `declare_namespaced_inherits`: passes over all namespaced inherits and
+  declares them on the locals stack, as well as inserts them into the provided
+  `TrackedBindings`
+* `declare_bindings`: declares all regular key/value bindings in a bindings
+  scope, but without actually compiling their keys or values.
+
+  There's a lot of heavy lifting going on here:
+
+  1. It invokes the various pieces of logic responsible for merging nested
+     attribute sets together, creating intermediate data structures in the value
+     slots of bindings that can be recursively processed the same way.
+  2. It decides on the key slots of expressions based on the kind of bindings,
+     and the type of expression providing the key.
+* `bind_values`: runs the actual compilation of values. Notably this function is
+  responsible for recursively compiling merged attribute sets when it encounters
+  a `Binding::Set` (on which it invokes `compile_bindings` itself).
+
+In addition to these several methods (such as `compile_attr_set`,
+`compile_let_in`, ...) invoke the binding-kind specific logic and then call out
+to the functions above.
diff --git a/tvix/eval/src/builtins/impure.rs b/tvix/eval/src/builtins/impure.rs
index 18403fe5d8..c82b910f5f 100644
--- a/tvix/eval/src/builtins/impure.rs
+++ b/tvix/eval/src/builtins/impure.rs
@@ -37,7 +37,7 @@ mod impure_builtins {
             Ok(p) => p,
         };
         let r = generators::request_open_file(&co, path).await;
-        Ok(hash_nix_string(algo.to_str()?, r).map(Value::from)?)
+        hash_nix_string(algo.to_str()?, r).map(Value::from)
     }
 
     #[builtin("pathExists")]
diff --git a/tvix/eval/src/builtins/mod.rs b/tvix/eval/src/builtins/mod.rs
index 04a0b3dd33..5cd94bcf37 100644
--- a/tvix/eval/src/builtins/mod.rs
+++ b/tvix/eval/src/builtins/mod.rs
@@ -274,9 +274,10 @@ mod pure_builtins {
         list: Value,
     ) -> Result<Value, ErrorKind> {
         let mut separator = separator.to_contextful_str()?;
+
         let mut context = NixContext::new();
-        if let Some(sep_context) = separator.context_mut() {
-            context = context.join(sep_context);
+        if let Some(sep_context) = separator.take_context() {
+            context.extend(sep_context.into_iter())
         }
         let list = list.to_list()?;
         let mut res = BString::default();
@@ -296,13 +297,8 @@ mod pure_builtins {
             {
                 Ok(mut s) => {
                     res.push_str(&s);
-                    if let Some(ref mut other_context) = s.context_mut() {
-                        // It is safe to consume the other context here
-                        // because the `list` and `separator` are originally
-                        // moved, here.
-                        // We are not going to use them again
-                        // because the result here is a string.
-                        context = context.join(other_context);
+                    if let Some(other_context) = s.take_context() {
+                        context.extend(other_context.into_iter());
                     }
                 }
                 Err(c) => return Ok(Value::Catchable(Box::new(c))),
@@ -764,9 +760,8 @@ mod pure_builtins {
         }
 
         if let Some(origin_ctx) = origin.context_mut() {
-            // FUTUREWORK(performance): avoid this clone
-            // and extend in-place.
-            *origin_ctx = origin_ctx.clone().join(&mut ctx_elements.into());
+            origin_ctx.extend(ctx_elements)
+            // TODO: didn't we forget cases where origin had no context?
         }
 
         Ok(origin.into())
@@ -1169,8 +1164,8 @@ mod pure_builtins {
         let mut empty_string_replace = false;
         let mut context = NixContext::new();
 
-        if let Some(string_context) = string.context_mut() {
-            context = context.join(string_context);
+        if let Some(string_context) = string.take_context() {
+            context.extend(string_context.into_iter());
         }
 
         // This can't be implemented using Rust's string.replace() as
@@ -1200,8 +1195,8 @@ mod pure_builtins {
                 if string[i..i + from.len()] == *from {
                     res.push_str(&to);
                     i += from.len();
-                    if let Some(to_ctx) = to.context_mut() {
-                        context = context.join(to_ctx);
+                    if let Some(to_ctx) = to.take_context() {
+                        context.extend(to_ctx.into_iter());
                     }
 
                     // remember if we applied the empty from->to
@@ -1232,8 +1227,8 @@ mod pure_builtins {
 
             if from.is_empty() {
                 res.push_str(&to);
-                if let Some(to_ctx) = to.context_mut() {
-                    context = context.join(to_ctx);
+                if let Some(to_ctx) = to.take_context() {
+                    context.extend(to_ctx.into_iter());
                 }
                 break;
             }
@@ -1504,8 +1499,19 @@ mod pure_builtins {
         }
 
         let mut buf: Vec<u8> = vec![];
-        to_xml::value_to_xml(&mut buf, &value)?;
-        Ok(String::from_utf8(buf)?.into())
+        let context = to_xml::value_to_xml(&mut buf, &value)?;
+
+        Ok((
+            buf,
+            // FUTUREWORK: We have a distinction between an empty context, and
+            // no context at all. Fix this.
+            if !context.is_empty() {
+                Some(Box::new(context))
+            } else {
+                None
+            },
+        )
+            .into())
     }
 
     #[builtin("placeholder")]
@@ -1612,10 +1618,16 @@ pub fn pure_builtins() -> Vec<(&'static str, Value)> {
         crate::systems::llvm_triple_to_nix_double(CURRENT_PLATFORM).into(),
     ));
 
-    // TODO: implement for nixpkgs compatibility
     result.push((
         "__curPos",
-        Value::from(CatchableErrorKind::UnimplementedFeature("__curPos".into())),
+        Value::Thunk(Thunk::new_suspended_native(Box::new(move || {
+            // TODO: implement for nixpkgs compatibility
+            Ok(Value::attrs(NixAttrs::from_iter([
+                ("line", 42.into()),
+                ("column", 42.into()),
+                ("file", Value::String("/deep/thought".into())),
+            ])))
+        }))),
     ));
 
     result
@@ -1623,6 +1635,8 @@ pub fn pure_builtins() -> Vec<(&'static str, Value)> {
 
 #[builtins]
 mod placeholder_builtins {
+    use crate::NixContext;
+
     use super::*;
 
     #[builtin("unsafeDiscardStringContext")]
@@ -1671,24 +1685,17 @@ mod placeholder_builtins {
             .to_contextful_str()?;
 
         // If there's any context, we will swap any ... by a path one.
-        if let Some(ctx) = v.context_mut() {
-            let new_context: tvix_eval::NixContext = ctx
-                .iter()
-                .map(|elem| match elem {
-                    // FUTUREWORK(performance): ideally, we should either:
-                    // (a) do interior mutation of the existing context.
-                    // (b) let the structural sharing make those clones cheap.
-                    crate::NixContextElement::Derivation(drv_path) => {
-                        crate::NixContextElement::Plain(drv_path.to_string())
-                    }
-                    elem => elem.clone(),
-                })
-                .collect::<HashSet<_>>()
-                .into();
+        if let Some(c) = v.take_context() {
+            let mut context = NixContext::new();
+            context.extend(c.into_iter().map(|elem| match elem {
+                crate::NixContextElement::Derivation(drv_path) => {
+                    crate::NixContextElement::Plain(drv_path.to_string())
+                }
+                elem => elem.clone(),
+            }));
 
-            *ctx = new_context;
+            return Ok(Value::String(NixString::new_context_from(context, v)));
         }
-
         Ok(Value::from(v))
     }
 
@@ -1709,6 +1716,7 @@ mod placeholder_builtins {
         _name: Value,
         _attrset: Value,
     ) -> Result<Value, ErrorKind> {
+        // TODO: implement for nixpkgs compatibility
         generators::emit_warning_kind(
             &co,
             WarningKind::NotImplemented("builtins.unsafeGetAttrsPos"),
diff --git a/tvix/eval/src/builtins/to_xml.rs b/tvix/eval/src/builtins/to_xml.rs
index bb12cebfc9..093e127fe2 100644
--- a/tvix/eval/src/builtins/to_xml.rs
+++ b/tvix/eval/src/builtins/to_xml.rs
@@ -3,112 +3,98 @@
 //! things in nixpkgs rely on.
 
 use bstr::ByteSlice;
+use std::borrow::Cow;
 use std::{io::Write, rc::Rc};
-use xml::writer::events::XmlEvent;
-use xml::writer::EmitterConfig;
-use xml::writer::EventWriter;
 
-use crate::{ErrorKind, Value};
+use crate::{ErrorKind, NixContext, NixContextElement, Value};
 
 /// Recursively serialise a value to XML. The value *must* have been
 /// deep-forced before being passed to this function.
-pub fn value_to_xml<W: Write>(mut writer: W, value: &Value) -> Result<(), ErrorKind> {
-    let config = EmitterConfig {
-        perform_indent: true,
-        pad_self_closing: true,
-
-        // Nix uses single-quotes *only* in the document declaration,
-        // so we need to write it manually.
-        write_document_declaration: false,
-        ..Default::default()
-    };
-
+/// On success, returns the NixContext.
+pub fn value_to_xml<W: Write>(mut writer: W, value: &Value) -> Result<NixContext, ErrorKind> {
     // Write a literal document declaration, using C++-Nix-style
     // single quotes.
     writeln!(writer, "<?xml version='1.0' encoding='utf-8'?>")?;
 
-    let mut writer = EventWriter::new_with_config(writer, config);
-
-    writer.write(XmlEvent::start_element("expr"))?;
-    value_variant_to_xml(&mut writer, value)?;
-    writer.write(XmlEvent::end_element())?;
+    let mut emitter = XmlEmitter::new(writer);
 
-    // Unwrap the writer to add the final newline that C++ Nix adds.
-    writeln!(writer.into_inner())?;
+    emitter.write_open_tag("expr", &[])?;
+    value_variant_to_xml(&mut emitter, value)?;
+    emitter.write_closing_tag("expr")?;
 
-    Ok(())
+    Ok(emitter.into_context())
 }
 
 fn write_typed_value<W: Write, V: ToString>(
-    w: &mut EventWriter<W>,
-    name: &str,
+    w: &mut XmlEmitter<W>,
+    name_unescaped: &str,
     value: V,
 ) -> Result<(), ErrorKind> {
-    w.write(XmlEvent::start_element(name).attr("value", &value.to_string()))?;
-    w.write(XmlEvent::end_element())?;
+    w.write_self_closing_tag(name_unescaped, &[("value", &value.to_string())])?;
     Ok(())
 }
 
-fn value_variant_to_xml<W: Write>(w: &mut EventWriter<W>, value: &Value) -> Result<(), ErrorKind> {
+fn value_variant_to_xml<W: Write>(w: &mut XmlEmitter<W>, value: &Value) -> Result<(), ErrorKind> {
     match value {
         Value::Thunk(t) => return value_variant_to_xml(w, &t.value()),
 
         Value::Null => {
-            w.write(XmlEvent::start_element("null"))?;
-            w.write(XmlEvent::end_element())
+            w.write_open_tag("null", &[])?;
+            w.write_closing_tag("null")?;
         }
 
         Value::Bool(b) => return write_typed_value(w, "bool", b),
         Value::Integer(i) => return write_typed_value(w, "int", i),
         Value::Float(f) => return write_typed_value(w, "float", f),
-        Value::String(s) => return write_typed_value(w, "string", s.to_str()?),
+        Value::String(s) => {
+            if let Some(context) = s.context() {
+                w.extend_context(context.iter().cloned());
+            }
+            return write_typed_value(w, "string", s.to_str()?);
+        }
         Value::Path(p) => return write_typed_value(w, "path", p.to_string_lossy()),
 
         Value::List(list) => {
-            w.write(XmlEvent::start_element("list"))?;
+            w.write_open_tag("list", &[])?;
 
             for elem in list.into_iter() {
                 value_variant_to_xml(w, elem)?;
             }
 
-            w.write(XmlEvent::end_element())
+            w.write_closing_tag("list")?;
         }
 
         Value::Attrs(attrs) => {
-            w.write(XmlEvent::start_element("attrs"))?;
+            w.write_open_tag("attrs", &[])?;
 
             for elem in attrs.iter() {
-                w.write(XmlEvent::start_element("attr").attr("name", &elem.0.to_str_lossy()))?;
+                w.write_open_tag("attr", &[("name", &elem.0.to_str_lossy())])?;
                 value_variant_to_xml(w, elem.1)?;
-                w.write(XmlEvent::end_element())?;
+                w.write_closing_tag("attr")?;
             }
 
-            w.write(XmlEvent::end_element())
+            w.write_closing_tag("attrs")?;
         }
 
         Value::Closure(c) => {
-            w.write(XmlEvent::start_element("function"))?;
+            w.write_open_tag("function", &[])?;
 
             match &c.lambda.formals {
                 Some(formals) => {
-                    let mut attrspat = XmlEvent::start_element("attrspat");
+                    let mut attrs: Vec<(&str, &str)> = Vec::with_capacity(2);
                     if formals.ellipsis {
-                        attrspat = attrspat.attr("ellipsis", "1");
+                        attrs.push(("ellipsis", "1"));
                     }
                     if let Some(ref name) = &formals.name {
-                        attrspat = attrspat.attr("name", name.as_str());
+                        attrs.push(("name", name.as_str()));
                     }
 
-                    w.write(attrspat)?;
-
+                    w.write_open_tag("attrspat", &attrs)?;
                     for arg in formals.arguments.iter() {
-                        w.write(
-                            XmlEvent::start_element("attr").attr("name", &arg.0.to_str_lossy()),
-                        )?;
-                        w.write(XmlEvent::end_element())?;
+                        w.write_self_closing_tag("attr", &[("name", &arg.0.to_str_lossy())])?;
                     }
 
-                    w.write(XmlEvent::end_element())?;
+                    w.write_closing_tag("attrspat")?;
                 }
                 None => {
                     // TODO(tazjin): tvix does not currently persist function
@@ -120,17 +106,16 @@ fn value_variant_to_xml<W: Write>(w: &mut EventWriter<W>, value: &Value) -> Resu
                     // If we don't want to persist the data, we can re-parse the
                     // AST from the spans of the lambda's bytecode and figure it
                     // out that way, but it needs some investigating.
-                    w.write(XmlEvent::start_element("varpat").attr("name", /* fake: */ "x"))?;
-                    w.write(XmlEvent::end_element())?;
+                    w.write_self_closing_tag("varpat", &[("name", /* fake: */ "x")])?;
                 }
             }
 
-            w.write(XmlEvent::end_element())
+            w.write_closing_tag("function")?;
         }
 
         Value::Builtin(_) => {
-            w.write(XmlEvent::start_element("unevaluated"))?;
-            w.write(XmlEvent::end_element())
+            w.write_open_tag("unevaluated", &[])?;
+            w.write_closing_tag("unevaluated")?;
         }
 
         Value::AttrNotFound
@@ -148,7 +133,189 @@ fn value_variant_to_xml<W: Write>(w: &mut EventWriter<W>, value: &Value) -> Resu
         Value::Catchable(_) => {
             panic!("tvix bug: value_to_xml() called on a value which had not been deep-forced")
         }
-    }?;
+    };
 
     Ok(())
 }
+
+/// A simple-stupid XML emitter, which implements only the subset needed for byte-by-byte compat with C++ nix’ `builtins.toXML`.
+struct XmlEmitter<W> {
+    /// The current indentation
+    cur_indent: usize,
+    writer: W,
+    context: NixContext,
+}
+
+impl<W: Write> XmlEmitter<W> {
+    pub fn new(writer: W) -> Self {
+        XmlEmitter {
+            cur_indent: 0,
+            writer,
+            context: Default::default(),
+        }
+    }
+
+    /// Write an open tag with the given name (which is not escaped!)
+    /// and attributes (Keys are not escaped! Only attribute values are.)
+    pub fn write_open_tag(
+        &mut self,
+        name_unescaped: &str,
+        attrs: &[(&str, &str)],
+    ) -> std::io::Result<()> {
+        self.add_indent()?;
+        self.writer.write_all(b"<")?;
+        self.writer.write_all(name_unescaped.as_bytes())?;
+        self.write_attrs_escape_vals(attrs)?;
+        self.writer.write_all(b">\n")?;
+        self.cur_indent += 2;
+        Ok(())
+    }
+
+    /// Write a self-closing open tag with the given name (which is not escaped!)
+    /// and attributes (Keys are not escaped! Only attribute values are.)
+    pub fn write_self_closing_tag(
+        &mut self,
+        name_unescaped: &str,
+        attrs: &[(&str, &str)],
+    ) -> std::io::Result<()> {
+        self.add_indent()?;
+        self.writer.write_all(b"<")?;
+        self.writer.write_all(name_unescaped.as_bytes())?;
+        self.write_attrs_escape_vals(attrs)?;
+        self.writer.write_all(b" />\n")?;
+        Ok(())
+    }
+
+    /// Write a closing tag with the given name (which is not escaped!)
+    pub fn write_closing_tag(&mut self, name_unescaped: &str) -> std::io::Result<()> {
+        self.cur_indent -= 2;
+        self.add_indent()?;
+        self.writer.write_all(b"</")?;
+        self.writer.write_all(name_unescaped.as_bytes())?;
+        self.writer.write_all(b">\n")?;
+        Ok(())
+    }
+
+    #[inline]
+    fn add_indent(&mut self) -> std::io::Result<()> {
+        self.writer.write_all(&b" ".repeat(self.cur_indent))
+    }
+
+    /// Write an attribute list
+    fn write_attrs_escape_vals(&mut self, attrs: &[(&str, &str)]) -> std::io::Result<()> {
+        for (name, val) in attrs {
+            self.writer.write_all(b" ")?;
+            self.writer.write_all(name.as_bytes())?;
+            self.writer.write_all(br#"=""#)?;
+            self.writer
+                .write_all(Self::escape_attr_value(val).as_bytes())?;
+            self.writer.write_all(b"\"")?;
+        }
+        Ok(())
+    }
+
+    /// Escape the given attribute value, making sure we only actually clone the string if we needed to replace something.
+    fn escape_attr_value(s: &str) -> Cow<str> {
+        let mut last_escape: usize = 0;
+        let mut res: Cow<str> = Cow::Borrowed("");
+        // iterating via char_indices gives us the ability to index the original string slice at character boundaries
+        for (idx, c) in s.char_indices() {
+            match Self::should_escape_char(c) {
+                None => {}
+                Some(new) => {
+                    // add characters since the last escape we did
+                    res += &s[last_escape..idx];
+                    // add the escaped value
+                    res += new;
+                    last_escape = idx + 1;
+                }
+            }
+        }
+        // we did not need to escape anything, so borrow original string
+        if last_escape == 0 {
+            Cow::Borrowed(s)
+        } else {
+            // add the remaining characters
+            res += &s[last_escape..];
+            res
+        }
+    }
+
+    fn should_escape_char(c: char) -> Option<&'static str> {
+        match c {
+            '<' => Some("&lt;"),
+            '>' => Some("&gt;"),
+            '"' => Some("&quot;"),
+            '\'' => Some("&apos;"),
+            '&' => Some("&amp;"),
+            '\n' => Some("&#xA;"),
+            '\r' => Some("&#xD;"),
+            _ => None,
+        }
+    }
+
+    /// Extends the existing context with more context elements.
+    fn extend_context<T>(&mut self, iter: T)
+    where
+        T: IntoIterator<Item = NixContextElement>,
+    {
+        self.context.extend(iter)
+    }
+
+    /// Consumes [Self] and returns the [NixContext] collected.
+    fn into_context(self) -> NixContext {
+        self.context
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use bytes::buf::Writer;
+    use pretty_assertions::assert_eq;
+
+    use crate::builtins::to_xml::XmlEmitter;
+    use std::borrow::Cow;
+
+    #[test]
+    fn xml_gen() {
+        let mut buf = Vec::new();
+        let mut x = XmlEmitter::new(&mut buf);
+        x.write_open_tag("hello", &[("hi", "it’s me"), ("no", "<escape>")])
+            .unwrap();
+        x.write_self_closing_tag("self-closing", &[("tag", "yay")])
+            .unwrap();
+        x.write_closing_tag("hello").unwrap();
+
+        assert_eq!(
+            std::str::from_utf8(&buf).unwrap(),
+            r##"<hello hi="it’s me" no="&lt;escape&gt;">
+  <self-closing tag="yay" />
+</hello>
+"##
+        );
+    }
+
+    #[test]
+    fn xml_escape() {
+        match XmlEmitter::<Writer<Vec<u8>>>::escape_attr_value("ab<>c&de") {
+            Cow::Owned(s) => assert_eq!(s, "ab&lt;&gt;c&amp;de".to_string(), "escape stuff"),
+            Cow::Borrowed(s) => panic!("s should be owned {}", s),
+        }
+        match XmlEmitter::<Writer<Vec<u8>>>::escape_attr_value("") {
+            Cow::Borrowed(s) => assert_eq!(s, "", "empty escape is borrowed"),
+            Cow::Owned(s) => panic!("s should be borrowed {}", s),
+        }
+        match XmlEmitter::<Writer<Vec<u8>>>::escape_attr_value("hi!ŷbla") {
+            Cow::Borrowed(s) => assert_eq!(s, "hi!ŷbla", "no escape is borrowed"),
+            Cow::Owned(s) => panic!("s should be borrowed {}", s),
+        }
+        match XmlEmitter::<Writer<Vec<u8>>>::escape_attr_value("hi!<ŷ>bla") {
+            Cow::Owned(s) => assert_eq!(
+                s,
+                "hi!&lt;ŷ&gt;bla".to_string(),
+                "multi-byte chars are correctly used"
+            ),
+            Cow::Borrowed(s) => panic!("s should be owned {}", s),
+        }
+    }
+}
diff --git a/tvix/eval/src/errors.rs b/tvix/eval/src/errors.rs
index 652252dadf..a5f79c235f 100644
--- a/tvix/eval/src/errors.rs
+++ b/tvix/eval/src/errors.rs
@@ -10,7 +10,6 @@ use std::{fmt::Debug, fmt::Display, num::ParseIntError};
 use codemap::{File, Span};
 use codemap_diagnostic::{ColorConfig, Diagnostic, Emitter, Level, SpanLabel, SpanStyle};
 use smol_str::SmolStr;
-use xml::writer::Error as XmlError;
 
 use crate::spans::ToSpan;
 use crate::value::{CoercionKind, NixString};
@@ -194,9 +193,6 @@ pub enum ErrorKind {
     /// Invalid UTF-8 was encoutered somewhere
     Utf8,
 
-    /// Errors while serialising to XML.
-    Xml(Rc<XmlError>),
-
     /// Variant for errors that bubble up to eval from other Tvix
     /// components.
     TvixError(Rc<dyn error::Error>),
@@ -248,7 +244,6 @@ impl error::Error for Error {
                 errors.first().map(|e| e as &dyn error::Error)
             }
             ErrorKind::IO { error, .. } => Some(error.as_ref()),
-            ErrorKind::Xml(error) => Some(error.as_ref()),
             ErrorKind::TvixError(error) => Some(error.as_ref()),
             _ => None,
         }
@@ -285,12 +280,6 @@ impl From<bstr::FromUtf8Error> for ErrorKind {
     }
 }
 
-impl From<XmlError> for ErrorKind {
-    fn from(err: XmlError) -> Self {
-        Self::Xml(Rc::new(err))
-    }
-}
-
 impl From<io::Error> for ErrorKind {
     fn from(e: io::Error) -> Self {
         ErrorKind::IO {
@@ -506,8 +495,6 @@ to a missing value in the attribute set(s) included via `with`."#,
                 write!(f, "Invalid UTF-8 in string")
             }
 
-            ErrorKind::Xml(error) => write!(f, "failed to serialise to XML: {error}"),
-
             ErrorKind::TvixError(inner_error) => {
                 write!(f, "{inner_error}")
             }
@@ -823,7 +810,6 @@ impl Error {
             | ErrorKind::JsonError(_)
             | ErrorKind::NotSerialisableToJson(_)
             | ErrorKind::FromTomlError(_)
-            | ErrorKind::Xml(_)
             | ErrorKind::Utf8
             | ErrorKind::TvixError(_)
             | ErrorKind::TvixBug { .. }
@@ -870,7 +856,6 @@ impl Error {
             ErrorKind::UnexpectedArgument { .. } => "E031",
             ErrorKind::RelativePathResolution(_) => "E032",
             ErrorKind::DivisionByZero => "E033",
-            ErrorKind::Xml(_) => "E034",
             ErrorKind::FromTomlError(_) => "E035",
             ErrorKind::NotSerialisableToJson(_) => "E036",
             ErrorKind::UnexpectedContext => "E037",
diff --git a/tvix/eval/src/io.rs b/tvix/eval/src/io.rs
index f775077af8..58586030aa 100644
--- a/tvix/eval/src/io.rs
+++ b/tvix/eval/src/io.rs
@@ -16,14 +16,16 @@
 //! how store paths are opened and so on.
 
 use std::{
-    fs::File,
     io,
     path::{Path, PathBuf},
 };
 
-#[cfg(target_family = "unix")]
+#[cfg(all(target_family = "unix", feature = "impure"))]
 use std::os::unix::ffi::OsStringExt;
 
+#[cfg(feature = "impure")]
+use std::fs::File;
+
 /// Types of files as represented by `builtins.readDir` in Nix.
 #[derive(Debug)]
 pub enum FileType {
diff --git a/tvix/eval/src/lib.rs b/tvix/eval/src/lib.rs
index 845964cb7e..398da4d6e2 100644
--- a/tvix/eval/src/lib.rs
+++ b/tvix/eval/src/lib.rs
@@ -29,7 +29,7 @@ mod vm;
 mod warnings;
 
 mod nix_search_path;
-#[cfg(test)]
+#[cfg(all(test, feature = "arbitrary"))]
 mod properties;
 #[cfg(test)]
 mod test_utils;
diff --git a/tvix/eval/src/nix_search_path.rs b/tvix/eval/src/nix_search_path.rs
index 566ca12238..369c5b6857 100644
--- a/tvix/eval/src/nix_search_path.rs
+++ b/tvix/eval/src/nix_search_path.rs
@@ -197,6 +197,8 @@ mod tests {
         }
     }
 
+    // this uses StdIO, which is only available with the impure feature.
+    #[cfg(feature = "impure")]
     mod resolve {
         use crate::StdIO;
         use path_clean::PathClean;
diff --git a/tvix/eval/src/tests/mod.rs b/tvix/eval/src/tests/mod.rs
index 5a7708e298..21b5d35e6a 100644
--- a/tvix/eval/src/tests/mod.rs
+++ b/tvix/eval/src/tests/mod.rs
@@ -1,203 +1,6 @@
-use crate::{value::Value, EvalIO};
-use builtin_macros::builtins;
-use pretty_assertions::assert_eq;
-use rstest::rstest;
-use std::path::PathBuf;
-
 /// Module for one-off tests which do not follow the rest of the
 /// test layout.
 mod one_offs;
 
-#[builtins]
-mod mock_builtins {
-    //! Builtins which are required by language tests, but should not
-    //! actually exist in //tvix/eval.
-    use crate as tvix_eval;
-    use crate::generators::GenCo;
-    use crate::*;
-    use genawaiter::rc::Gen;
-
-    #[builtin("derivation")]
-    async fn builtin_derivation(co: GenCo, input: Value) -> Result<Value, ErrorKind> {
-        let input = input.to_attrs()?;
-        let attrs = input.update(NixAttrs::from_iter(
-            [
-                (
-                    "outPath",
-                    "/nix/store/00000000000000000000000000000000-mock",
-                ),
-                (
-                    "drvPath",
-                    "/nix/store/00000000000000000000000000000000-mock.drv",
-                ),
-                ("type", "derivation"),
-            ]
-            .into_iter(),
-        ));
-
-        Ok(Value::Attrs(Box::new(attrs)))
-    }
-}
-
-fn eval_test(code_path: PathBuf, expect_success: bool) {
-    std::env::set_var("TEST_VAR", "foo"); // for eval-okay-getenv.nix
-
-    eprintln!("path: {}", code_path.display());
-    assert_eq!(
-        code_path.extension().unwrap(),
-        "nix",
-        "test files always end in .nix"
-    );
-
-    let code = std::fs::read_to_string(&code_path).expect("should be able to read test code");
-
-    let mut eval = crate::Evaluation::new_impure();
-    eval.strict = true;
-    eval.builtins.extend(mock_builtins::builtins());
-
-    let result = eval.evaluate(code, Some(code_path.clone()));
-    let failed = match result.value {
-        Some(Value::Catchable(_)) => true,
-        _ => !result.errors.is_empty(),
-    };
-    if expect_success && failed {
-        panic!(
-            "{}: evaluation of eval-okay test should succeed, but failed with {:?}",
-            code_path.display(),
-            result.errors,
-        );
-    }
-
-    if !expect_success && failed {
-        return;
-    }
-    // !expect_success can also mean the output differs, so don't panic if the
-    // evaluation didn't fail.
-
-    let value = result.value.unwrap();
-    let result_str = value.to_string();
-
-    let exp_path = code_path.with_extension("exp");
-    if exp_path.exists() {
-        // If there's an .exp file provided alongside, compare it with the
-        // output of the NixValue .to_string() method.
-        let exp_str = std::fs::read_to_string(&exp_path).expect("unable to read .exp file");
-
-        if expect_success {
-            assert_eq!(
-                result_str,
-                exp_str.trim(),
-                "{}: result value representation (left) must match expectation (right)",
-                code_path.display()
-            );
-        } else {
-            assert_ne!(
-                result_str,
-                exp_str.trim(),
-                "{}: test passed unexpectedly!  consider moving it out of notyetpassing",
-                code_path.display()
-            );
-
-            // Early return here, we don't compare .xml outputs if this is a !
-            // expect_success test.
-            return;
-        }
-    }
-
-    let exp_xml_path = code_path.with_extension("exp.xml");
-    if exp_xml_path.exists() {
-        // If there's an XML file provided alongside, compare it with the
-        // output produced when serializing the Value as XML.
-        let exp_xml_str = std::fs::read_to_string(exp_xml_path).expect("unable to read .xml file");
-
-        let mut xml_actual_buf = Vec::new();
-        crate::builtins::value_to_xml(&mut xml_actual_buf, &value).expect("value_to_xml failed");
-
-        assert_eq!(
-            String::from_utf8(xml_actual_buf).expect("to_xml produced invalid utf-8"),
-            exp_xml_str,
-            "{}: result value representation (left) must match expectation (right)",
-            code_path.display()
-        );
-    }
-}
-
-// identity-* tests contain Nix code snippets which should evaluate to
-// themselves exactly (i.e. literals).
-#[rstest]
-fn identity(#[files("src/tests/tvix_tests/identity-*.nix")] code_path: PathBuf) {
-    let code = std::fs::read_to_string(code_path).expect("should be able to read test code");
-
-    let mut eval = crate::Evaluation::new(Box::new(crate::StdIO) as Box<dyn EvalIO>, false);
-    eval.strict = true;
-
-    let result = eval.evaluate(&code, None);
-    assert!(
-        result.errors.is_empty(),
-        "evaluation of identity test failed: {:?}",
-        result.errors
-    );
-
-    let result_str = result.value.unwrap().to_string();
-
-    assert_eq!(
-        result_str,
-        code.trim(),
-        "result value representation (left) must match expectation (right)"
-    )
-}
-
-// eval-okay-* tests contain a snippet of Nix code, and an expectation
-// of the produced string output of the evaluator.
-//
-// These evaluations are always supposed to succeed, i.e. all snippets
-// are guaranteed to be valid Nix code.
-#[rstest]
-fn eval_okay(#[files("src/tests/tvix_tests/eval-okay-*.nix")] code_path: PathBuf) {
-    eval_test(code_path, true)
-}
-
-// eval-okay-* tests from the original Nix test suite.
-#[cfg(feature = "nix_tests")]
-#[rstest]
-fn nix_eval_okay(#[files("src/tests/nix_tests/eval-okay-*.nix")] code_path: PathBuf) {
-    eval_test(code_path, true)
-}
-
-// eval-okay-* tests from the original Nix test suite which do not yet pass for tvix
-//
-// Eventually there will be none of these left, and this function
-// will disappear :)
-//
-// Please don't submit failing tests unless they're in
-// notyetpassing; this makes the test suite much more useful for
-// regression testing, since there should always be zero non-ignored
-// failing tests.
-#[rstest]
-fn nix_eval_okay_currently_failing(
-    #[files("src/tests/nix_tests/notyetpassing/eval-okay-*.nix")] code_path: PathBuf,
-) {
-    eval_test(code_path, false)
-}
-
-#[rstest]
-fn eval_okay_currently_failing(
-    #[files("src/tests/tvix_tests/notyetpassing/eval-okay-*.nix")] code_path: PathBuf,
-) {
-    eval_test(code_path, false)
-}
-
-// eval-fail-* tests contain a snippet of Nix code, which is
-// expected to fail evaluation.  The exact type of failure
-// (assertion, parse error, etc) is not currently checked.
-#[rstest]
-fn eval_fail(#[files("src/tests/tvix_tests/eval-fail-*.nix")] code_path: PathBuf) {
-    eval_test(code_path, false)
-}
-
-// eval-fail-* tests from the original Nix test suite.
 #[cfg(feature = "nix_tests")]
-#[rstest]
-fn nix_eval_fail(#[files("src/tests/nix_tests/eval-fail-*.nix")] code_path: PathBuf) {
-    eval_test(code_path, false)
-}
+mod nix_tests;
diff --git a/tvix/eval/src/tests/nix_tests.rs b/tvix/eval/src/tests/nix_tests.rs
new file mode 100644
index 0000000000..17968e4bdb
--- /dev/null
+++ b/tvix/eval/src/tests/nix_tests.rs
@@ -0,0 +1,207 @@
+use crate::value::Value;
+use builtin_macros::builtins;
+use pretty_assertions::assert_eq;
+use rstest::rstest;
+use std::path::PathBuf;
+
+#[builtins]
+mod mock_builtins {
+    //! Builtins which are required by language tests, but should not
+    //! actually exist in //tvix/eval.
+    use crate as tvix_eval;
+    use crate::generators::GenCo;
+    use crate::*;
+    use genawaiter::rc::Gen;
+
+    #[builtin("derivation")]
+    async fn builtin_derivation(co: GenCo, input: Value) -> Result<Value, ErrorKind> {
+        let input = input.to_attrs()?;
+        let attrs = input.update(NixAttrs::from_iter(
+            [
+                (
+                    "outPath",
+                    "/nix/store/00000000000000000000000000000000-mock",
+                ),
+                (
+                    "drvPath",
+                    "/nix/store/00000000000000000000000000000000-mock.drv",
+                ),
+                ("type", "derivation"),
+            ]
+            .into_iter(),
+        ));
+
+        Ok(Value::Attrs(Box::new(attrs)))
+    }
+}
+
+#[cfg(feature = "impure")]
+fn eval_test(code_path: PathBuf, expect_success: bool) {
+    std::env::set_var("TEST_VAR", "foo"); // for eval-okay-getenv.nix
+
+    eprintln!("path: {}", code_path.display());
+    assert_eq!(
+        code_path.extension().unwrap(),
+        "nix",
+        "test files always end in .nix"
+    );
+
+    let code = std::fs::read_to_string(&code_path).expect("should be able to read test code");
+
+    let mut eval = crate::Evaluation::new_impure();
+    eval.strict = true;
+    eval.builtins.extend(mock_builtins::builtins());
+
+    let result = eval.evaluate(code, Some(code_path.clone()));
+    let failed = match result.value {
+        Some(Value::Catchable(_)) => true,
+        _ => !result.errors.is_empty(),
+    };
+    if expect_success && failed {
+        panic!(
+            "{}: evaluation of eval-okay test should succeed, but failed with {:?}",
+            code_path.display(),
+            result.errors,
+        );
+    }
+
+    if !expect_success && failed {
+        return;
+    }
+    // !expect_success can also mean the output differs, so don't panic if the
+    // evaluation didn't fail.
+
+    let value = result.value.unwrap();
+    let result_str = value.to_string();
+
+    let exp_path = code_path.with_extension("exp");
+    if exp_path.exists() {
+        // If there's an .exp file provided alongside, compare it with the
+        // output of the NixValue .to_string() method.
+        let exp_str = std::fs::read_to_string(&exp_path).expect("unable to read .exp file");
+
+        if expect_success {
+            assert_eq!(
+                result_str,
+                exp_str.trim(),
+                "{}: result value representation (left) must match expectation (right)",
+                code_path.display()
+            );
+        } else {
+            assert_ne!(
+                result_str,
+                exp_str.trim(),
+                "{}: test passed unexpectedly!  consider moving it out of notyetpassing",
+                code_path.display()
+            );
+
+            // Early return here, we don't compare .xml outputs if this is a !
+            // expect_success test.
+            return;
+        }
+    }
+
+    let exp_xml_path = code_path.with_extension("exp.xml");
+    if exp_xml_path.exists() {
+        // If there's an XML file provided alongside, compare it with the
+        // output produced when serializing the Value as XML.
+        let exp_xml_str = std::fs::read_to_string(exp_xml_path).expect("unable to read .xml file");
+
+        let mut xml_actual_buf = Vec::new();
+        crate::builtins::value_to_xml(&mut xml_actual_buf, &value).expect("value_to_xml failed");
+
+        assert_eq!(
+            String::from_utf8(xml_actual_buf).expect("to_xml produced invalid utf-8"),
+            exp_xml_str,
+            "{}: result value representation (left) must match expectation (right)",
+            code_path.display()
+        );
+    }
+}
+
+// identity-* tests contain Nix code snippets which should evaluate to
+// themselves exactly (i.e. literals).
+#[cfg(feature = "impure")]
+#[rstest]
+fn identity(#[files("src/tests/tvix_tests/identity-*.nix")] code_path: PathBuf) {
+    use crate::EvalIO;
+
+    let code = std::fs::read_to_string(code_path).expect("should be able to read test code");
+
+    let mut eval = crate::Evaluation::new(Box::new(crate::StdIO) as Box<dyn EvalIO>, false);
+    eval.strict = true;
+
+    let result = eval.evaluate(&code, None);
+    assert!(
+        result.errors.is_empty(),
+        "evaluation of identity test failed: {:?}",
+        result.errors
+    );
+
+    let result_str = result.value.unwrap().to_string();
+
+    assert_eq!(
+        result_str,
+        code.trim(),
+        "result value representation (left) must match expectation (right)"
+    )
+}
+
+// eval-okay-* tests contain a snippet of Nix code, and an expectation
+// of the produced string output of the evaluator.
+//
+// These evaluations are always supposed to succeed, i.e. all snippets
+// are guaranteed to be valid Nix code.
+#[cfg(feature = "impure")]
+#[rstest]
+fn eval_okay(#[files("src/tests/tvix_tests/eval-okay-*.nix")] code_path: PathBuf) {
+    eval_test(code_path, true)
+}
+
+// eval-okay-* tests from the original Nix test suite.
+#[cfg(feature = "impure")]
+#[rstest]
+fn nix_eval_okay(#[files("src/tests/nix_tests/eval-okay-*.nix")] code_path: PathBuf) {
+    eval_test(code_path, true)
+}
+
+// eval-okay-* tests from the original Nix test suite which do not yet pass for tvix
+//
+// Eventually there will be none of these left, and this function
+// will disappear :)
+//
+// Please don't submit failing tests unless they're in
+// notyetpassing; this makes the test suite much more useful for
+// regression testing, since there should always be zero non-ignored
+// failing tests.
+#[cfg(feature = "impure")]
+#[rstest]
+fn nix_eval_okay_currently_failing(
+    #[files("src/tests/nix_tests/notyetpassing/eval-okay-*.nix")] code_path: PathBuf,
+) {
+    eval_test(code_path, false)
+}
+
+#[cfg(feature = "impure")]
+#[rstest]
+fn eval_okay_currently_failing(
+    #[files("src/tests/tvix_tests/notyetpassing/eval-okay-*.nix")] code_path: PathBuf,
+) {
+    eval_test(code_path, false)
+}
+
+// eval-fail-* tests contain a snippet of Nix code, which is
+// expected to fail evaluation.  The exact type of failure
+// (assertion, parse error, etc) is not currently checked.
+#[cfg(feature = "impure")]
+#[rstest]
+fn eval_fail(#[files("src/tests/tvix_tests/eval-fail-*.nix")] code_path: PathBuf) {
+    eval_test(code_path, false)
+}
+
+// eval-fail-* tests from the original Nix test suite.
+#[cfg(feature = "impure")]
+#[rstest]
+fn nix_eval_fail(#[files("src/tests/nix_tests/eval-fail-*.nix")] code_path: PathBuf) {
+    eval_test(code_path, false)
+}
diff --git a/tvix/eval/src/tests/one_offs.rs b/tvix/eval/src/tests/one_offs.rs
index 565d1dd48f..21e9144baf 100644
--- a/tvix/eval/src/tests/one_offs.rs
+++ b/tvix/eval/src/tests/one_offs.rs
@@ -5,7 +5,7 @@ fn test_source_builtin() {
     // Test an evaluation with a source-only builtin. The test ensures
     // that the artificially constructed thunking is correct.
 
-    let mut eval = Evaluation::new_impure();
+    let mut eval = Evaluation::new_pure();
     eval.src_builtins.push(("testSourceBuiltin", "42"));
 
     let result = eval.evaluate("builtins.testSourceBuiltin", None);
diff --git a/tvix/eval/src/tests/tvix_tests/eval-okay-toxml-empty.exp.xml b/tvix/eval/src/tests/tvix_tests/eval-okay-toxml-empty.exp.xml
new file mode 100644
index 0000000000..468972b2f8
--- /dev/null
+++ b/tvix/eval/src/tests/tvix_tests/eval-okay-toxml-empty.exp.xml
@@ -0,0 +1,5 @@
+<?xml version='1.0' encoding='utf-8'?>
+<expr>
+  <attrs>
+  </attrs>
+</expr>
diff --git a/tvix/eval/src/tests/tvix_tests/eval-okay-toxml-empty.nix b/tvix/eval/src/tests/tvix_tests/eval-okay-toxml-empty.nix
new file mode 100644
index 0000000000..ffcd4415b0
--- /dev/null
+++ b/tvix/eval/src/tests/tvix_tests/eval-okay-toxml-empty.nix
@@ -0,0 +1 @@
+{ }
diff --git a/tvix/eval/src/tests/tvix_tests/eval-okay-toxml.exp b/tvix/eval/src/tests/tvix_tests/eval-okay-toxml.exp
new file mode 100644
index 0000000000..9ae16de526
--- /dev/null
+++ b/tvix/eval/src/tests/tvix_tests/eval-okay-toxml.exp
@@ -0,0 +1 @@
+"<?xml version='1.0' encoding='utf-8'?>\n<expr>\n  <attrs>\n    <attr name=\"&amp;-{\">\n      <string value=\";&amp;&quot;\" />\n    </attr>\n    <attr name=\"a\">\n      <string value=\"s\" />\n    </attr>\n  </attrs>\n</expr>\n"
diff --git a/tvix/eval/src/tests/tvix_tests/eval-okay-toxml.nix b/tvix/eval/src/tests/tvix_tests/eval-okay-toxml.nix
new file mode 100644
index 0000000000..7d074048dd
--- /dev/null
+++ b/tvix/eval/src/tests/tvix_tests/eval-okay-toxml.nix
@@ -0,0 +1,2 @@
+# Check some corner cases regarding escaping.
+builtins.toXML { a = "s"; "&-{" = ";&\""; }
diff --git a/tvix/eval/src/value/json.rs b/tvix/eval/src/value/json.rs
index c48e9c1f4e..24a6bcaf6f 100644
--- a/tvix/eval/src/value/json.rs
+++ b/tvix/eval/src/value/json.rs
@@ -47,8 +47,8 @@ impl Value {
 
                 for val in l.into_iter() {
                     match generators::request_to_json(co, val).await {
-                        Ok((v, mut ctx)) => {
-                            context = context.join(&mut ctx);
+                        Ok((v, ctx)) => {
+                            context.extend(ctx.into_iter());
                             out.push(v)
                         }
                         Err(cek) => return Ok(Err(cek)),
@@ -100,8 +100,8 @@ impl Value {
                     out.insert(
                         name.to_str()?.to_owned(),
                         match generators::request_to_json(co, value).await {
-                            Ok((v, mut ctx)) => {
-                                context = context.join(&mut ctx);
+                            Ok((v, ctx)) => {
+                                context.extend(ctx.into_iter());
                                 v
                             }
                             Err(cek) => return Ok(Err(cek)),
diff --git a/tvix/eval/src/value/mod.rs b/tvix/eval/src/value/mod.rs
index c171c9a04e..dfad0cd839 100644
--- a/tvix/eval/src/value/mod.rs
+++ b/tvix/eval/src/value/mod.rs
@@ -338,8 +338,8 @@ impl Value {
             let coerced: Result<BString, _> = match (value, kind) {
                 // coercions that are always done
                 (Value::String(mut s), _) => {
-                    if let Some(ctx) = s.context_mut() {
-                        context = context.join(ctx);
+                    if let Some(ctx) = s.take_context() {
+                        context.extend(ctx.into_iter());
                     }
                     Ok((*s).into())
                 }
diff --git a/tvix/eval/src/value/string.rs b/tvix/eval/src/value/string.rs
index ceb43f1ea5..0b37d483b3 100644
--- a/tvix/eval/src/value/string.rs
+++ b/tvix/eval/src/value/string.rs
@@ -78,20 +78,19 @@ impl NixContext {
         self
     }
 
-    /// Consumes both ends of the join into a new NixContent
-    /// containing the union of elements of both ends.
-    pub fn join(mut self, other: &mut NixContext) -> Self {
-        let other_set = std::mem::take(&mut other.0);
-        let mut set: HashSet<NixContextElement> = std::mem::take(&mut self.0);
-        set.extend(other_set);
-        Self(set)
+    /// Extends the existing context with more context elements.
+    pub fn extend<T>(&mut self, iter: T)
+    where
+        T: IntoIterator<Item = NixContextElement>,
+    {
+        self.0.extend(iter)
     }
 
     /// Copies from another [NixString] its context strings
     /// in this context.
     pub fn mimic(&mut self, other: &NixString) {
         if let Some(context) = other.context() {
-            self.0.extend(context.iter().cloned());
+            self.extend(context.iter().cloned());
         }
     }
 
@@ -154,6 +153,16 @@ impl NixContext {
     }
 }
 
+impl IntoIterator for NixContext {
+    type Item = NixContextElement;
+
+    type IntoIter = std::collections::hash_set::IntoIter<NixContextElement>;
+
+    fn into_iter(self) -> Self::IntoIter {
+        self.0.into_iter()
+    }
+}
+
 /// This type is never instantiated, but serves to document the memory layout of the actual heap
 /// allocation for Nix strings.
 #[allow(dead_code)]
@@ -616,6 +625,11 @@ mod arbitrary {
 
 impl NixString {
     fn new(contents: &[u8], context: Option<Box<NixContext>>) -> Self {
+        debug_assert!(
+            !context.as_deref().is_some_and(NixContext::is_empty),
+            "BUG: initialized with empty context"
+        );
+
         // SAFETY: We're always fully initializing a NixString here:
         //
         // 1. NixStringInner::alloc sets up the len for us
@@ -707,8 +721,10 @@ impl NixString {
         let context = [self.context(), other.context()]
             .into_iter()
             .flatten()
-            .fold(NixContext::new(), |acc_ctx, new_ctx| {
-                acc_ctx.join(&mut new_ctx.clone())
+            .fold(NixContext::new(), |mut acc_ctx, new_ctx| {
+                // TODO: consume new_ctx?
+                acc_ctx.extend(new_ctx.iter().cloned());
+                acc_ctx
             });
         Self::new_context_from(context, s)
     }
@@ -719,16 +735,30 @@ impl NixString {
         //
         // Also, we're using the same lifetime and mutability as self, to fit the
         // pointer-to-reference conversion rules
-        unsafe { NixStringInner::context_ref(self.0).as_deref() }
+        let context = unsafe { NixStringInner::context_ref(self.0).as_deref() };
+
+        debug_assert!(
+            !context.is_some_and(NixContext::is_empty),
+            "BUG: empty context"
+        );
+
+        context
     }
 
-    pub(crate) fn context_mut(&mut self) -> Option<&mut NixContext> {
+    pub(crate) fn context_mut(&mut self) -> &mut Option<Box<NixContext>> {
         // SAFETY: There's no way to construct an uninitialized or invalid NixString (see the SAFETY
         // comment in `new`).
         //
         // Also, we're using the same lifetime and mutability as self, to fit the
         // pointer-to-reference conversion rules
-        unsafe { NixStringInner::context_mut(self.0).as_deref_mut() }
+        let context = unsafe { NixStringInner::context_mut(self.0) };
+
+        debug_assert!(
+            !context.as_deref().is_some_and(NixContext::is_empty),
+            "BUG: empty context"
+        );
+
+        context
     }
 
     pub fn iter_context(&self) -> impl Iterator<Item = &NixContext> {
@@ -756,12 +786,16 @@ impl NixString {
         self.context().is_some()
     }
 
+    /// This clears the context of the string, returning
+    /// the removed dependency tracking information.
+    pub fn take_context(&mut self) -> Option<Box<NixContext>> {
+        self.context_mut().take()
+    }
+
     /// This clears the context of that string, losing
     /// all dependency tracking information.
     pub fn clear_context(&mut self) {
-        // SAFETY: There's no way to construct an uninitialized or invalid NixString (see the SAFETY
-        // comment in `new`).
-        *unsafe { NixStringInner::context_mut(self.0) } = None;
+        let _ = self.take_context();
     }
 
     pub fn chars(&self) -> Chars<'_> {
@@ -849,7 +883,7 @@ impl Display for NixString {
     }
 }
 
-#[cfg(test)]
+#[cfg(all(test, feature = "arbitrary"))]
 mod tests {
     use test_strategy::proptest;
 
diff --git a/tvix/eval/src/vm/generators.rs b/tvix/eval/src/vm/generators.rs
index 79de688692..dbf7703bf0 100644
--- a/tvix/eval/src/vm/generators.rs
+++ b/tvix/eval/src/vm/generators.rs
@@ -745,6 +745,7 @@ pub async fn request_open_file(co: &GenCo, path: PathBuf) -> Box<dyn std::io::Re
     }
 }
 
+#[cfg_attr(not(feature = "impure"), allow(unused))]
 pub(crate) async fn request_path_exists(co: &GenCo, path: PathBuf) -> Value {
     match co.yield_(VMRequest::PathExists(path)).await {
         VMResponse::Value(value) => value,
@@ -755,6 +756,7 @@ pub(crate) async fn request_path_exists(co: &GenCo, path: PathBuf) -> Value {
     }
 }
 
+#[cfg_attr(not(feature = "impure"), allow(unused))]
 pub(crate) async fn request_read_dir(co: &GenCo, path: PathBuf) -> Vec<(bytes::Bytes, FileType)> {
     match co.yield_(VMRequest::ReadDir(path)).await {
         VMResponse::Directory(dir) => dir,
diff --git a/tvix/eval/src/vm/mod.rs b/tvix/eval/src/vm/mod.rs
index c10b79cd99..1fa9eba4bc 100644
--- a/tvix/eval/src/vm/mod.rs
+++ b/tvix/eval/src/vm/mod.rs
@@ -994,8 +994,8 @@ where
             }
             let mut nix_string = val.to_contextful_str().with_span(frame, self)?;
             out.push_str(nix_string.as_bstr());
-            if let Some(nix_string_ctx) = nix_string.context_mut() {
-                context = context.join(nix_string_ctx);
+            if let Some(nix_string_ctx) = nix_string.take_context() {
+                context.extend(nix_string_ctx.into_iter())
             }
         }
 
@@ -1148,7 +1148,7 @@ where
                     let mut captured_with_stack = frame
                         .upvalues
                         .with_stack()
-                        .map(Clone::clone)
+                        .cloned()
                         // ... or make an empty one if there isn't one already.
                         .unwrap_or_else(|| Vec::with_capacity(self.with_stack.len()));
 
diff --git a/tvix/eval/tests/nix_oracle.rs b/tvix/eval/tests/nix_oracle.rs
index 6bab75cfd9..3d3abc7363 100644
--- a/tvix/eval/tests/nix_oracle.rs
+++ b/tvix/eval/tests/nix_oracle.rs
@@ -30,7 +30,14 @@ fn nix_eval(expr: &str, strictness: Strictness) -> String {
         .arg(format!("({expr})"))
         .env(
             "NIX_REMOTE",
-            format!("local?root={}", store_dir.path().display()),
+            format!(
+                "local?root={}",
+                store_dir
+                    .path()
+                    .canonicalize()
+                    .expect("valid path")
+                    .display()
+            ),
         )
         .output()
         .unwrap();
@@ -49,6 +56,7 @@ fn nix_eval(expr: &str, strictness: Strictness) -> String {
 /// `NIX_INSTANTIATE_BINARY_PATH` env var to resolve the `nix-instantiate` binary) and tvix, and
 /// assert that the result is identical
 #[track_caller]
+#[cfg(feature = "impure")]
 fn compare_eval(expr: &str, strictness: Strictness) {
     let nix_result = nix_eval(expr, strictness);
     let mut eval = tvix_eval::Evaluation::new_pure();
@@ -69,6 +77,7 @@ fn compare_eval(expr: &str, strictness: Strictness) {
 macro_rules! compare_eval_tests {
     ($strictness:expr, {}) => {};
     ($strictness:expr, {$(#[$meta:meta])* $test_name: ident($expr: expr); $($rest:tt)*}) => {
+        #[cfg(feature = "impure")]
         #[test]
         $(#[$meta])*
         fn $test_name() {