@aaronpowell

This will help solve asg017/sqlite-vec#193

I'm not 100% sure that the files will be in exactly the right place for NuGet; I need to review it in the context of dotnet/efcore#35617 a bit more, but this was my first pass at writing some Rust and getting it to produce a package.

The folders output in builds might not match the way we store RIDs in .NET, so we'll map them across.
@krwq

krwq commented Mar 7, 2025

The package layout you sent me looks almost good; you're missing the native subfolder. It's currently runtimes/<rid>/<lib-file> but should be runtimes/<rid>/native/<lib-file>

I'd also recommend matching the version (the one you sent me is 0.0.0-alpha.1 but it should likely be 0.1.6, 0.1.7-alpha.1 or 0.1.7-alpha.2)
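
To make the runtimes/<rid>/native/<lib-file> layout above concrete, here is a minimal Rust sketch of mapping a build target to a .NET RID and then to a path inside the nupkg. The function names and the input spellings ("windows", "x86_64", "aarch64", ...) are assumptions for illustration, not the code in this PR.

```rust
/// Map a build target (OS name, CPU architecture) to a .NET runtime identifier (RID).
/// The input spellings are hypothetical; adjust them to whatever the build actually emits.
fn dotnet_rid(os: &str, arch: &str) -> Option<String> {
    let os_part = match os {
        "windows" => "win",
        "linux" => "linux",
        "macos" => "osx",
        _ => return None, // unknown OS: no RID
    };
    let arch_part = match arch {
        "x86_64" => "x64",
        "x86" => "x86",
        "aarch64" => "arm64",
        _ => return None, // unknown architecture: no RID
    };
    Some(format!("{}-{}", os_part, arch_part))
}

/// Path of a native library inside the nupkg: runtimes/<rid>/native/<lib-file>.
fn nupkg_native_path(rid: &str, lib_file: &str) -> String {
    format!("runtimes/{}/native/{}", rid, lib_file)
}
```

Returning `Option` means an unmapped target is skipped explicitly rather than packed into a folder that NuGet would ignore.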

Co-authored-by: Krzysztof Wicher <[email protected]>
@asg017
Owner

asg017 commented Mar 7, 2025

Hey this is awesome! I'm at a conference this weekend but will review this when I'm back, definitely want to get dotnet support here.

Thank you again for setting this all up!

@krwq

krwq commented Mar 10, 2025

@aaronpowell I believe a little more will be needed to make this work on (old) .NET Framework. I'll try to figure out the details; we can still go ahead with this as is and do that work in a later iteration.

@krwq

krwq commented Mar 10, 2025

so for netfx the simplest thing which works (in 64-bit) is:

<?xml version="1.0" encoding="utf-8"?>
<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <ItemGroup>
    <Content Include="$(MSBuildThisFileDirectory)..\..\runtimes\win-x64\native\vec0.dll">
      <Link>vec0.dll</Link>
      <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
      <Pack>false</Pack>
    </Content>
  </ItemGroup>
</Project>

under buildTransitive\net46 in the nupkg. The only problem is that netfx defaults to AnyCPU, which runs as x86, and sqlite-vec provides only an x64 asset.

@aaronpowell
Author

so for netfx the simplest thing which works (in 64-bit) is:

<?xml version="1.0" encoding="utf-8"?>
<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <ItemGroup>
    <Content Include="$(MSBuildThisFileDirectory)..\..\runtimes\win-x64\native\vec0.dll">
      <Link>vec0.dll</Link>
      <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
      <Pack>false</Pack>
    </Content>
  </ItemGroup>
</Project>

under buildTransitive\net46 in the nupkg. The only problem is that netfx defaults to AnyCPU, which runs as x86, and sqlite-vec provides only an x64 asset.

so it needs to include a props file in the buildTransitive\net46 folder?

@krwq

krwq commented Mar 11, 2025

I'll try to get more details from someone more knowledgeable about this; I'm not sure I fully understand yet what exactly needs to happen. I am sure, though, that when I do #12 (comment) and name the file buildTransitive\net46\sqlite-vec.targets, the extension is found on netfx once I change the arch to x64 (the default is x86, so it finds the DLL but fails to load it because it's built for the wrong arch). The scenarios we're likely missing with this are Mono/Xamarin. I suspect we'd need an x86 build of sqlite-vec to get most scenarios to work.

I guess we could also copy what https://www.nuget.org/packages/Microsoft.ML does, which was given to me as an example of the correct way to do it. They have both props:

<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <!--
  NuGet packages.config doesn't support native assemblies automatically,
  so copy the native assemblies to the output directory.
  -->
  <ItemGroup Condition="Exists('packages.config') OR
                        Exists('$(MSBuildProjectName).packages.config') OR
                        Exists('packages.$(MSBuildProjectName).config')">
    <Content Include="$(MSBuildThisFileDirectory)\..\..\runtimes\win-x64\native\*.dll"
             Condition="'$(PlatformTarget)' == 'x64'">
      <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
      <Visible>false</Visible>
      <Link>%(Filename)%(Extension)</Link>
    </Content>

    <Content Include="$(MSBuildThisFileDirectory)\..\..\runtimes\win-x86\native\*.dll"
             Condition="'$(PlatformTarget)' == 'x86'">
      <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
      <Visible>false</Visible>
      <Link>%(Filename)%(Extension)</Link>
    </Content>

    <Content Include="$(MSBuildThisFileDirectory)\..\..\runtimes\win-arm64\native\*.dll"
             Condition="'$(PlatformTarget)' == 'arm64'">
      <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
      <Visible>false</Visible>
      <Link>%(Filename)%(Extension)</Link>
    </Content>
  </ItemGroup>

</Project>

and targets:

<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <PropertyGroup>
    <EnableMLUnsupportedPlatformTargetCheck Condition="'$(EnableMLUnsupportedPlatformTargetCheck)' == ''">true</EnableMLUnsupportedPlatformTargetCheck>
  </PropertyGroup>

  <Target Name="_CheckForUnsupportedPlatformTarget"
          Condition="'$(EnableMLUnsupportedPlatformTargetCheck)' == 'true'"
          AfterTargets="_CheckForInvalidConfigurationAndPlatform">

    <!--
    Special case .NET Core portable applications.  When building a portable .NET Core app,
    the PlatformTarget is empty, and you don't know until runtime (i.e. which dotnet.exe)
    what processor architecture will be used.
    -->
    <Error Condition="('$(PlatformTarget)' != 'x64' AND '$(PlatformTarget)' != 'x86') AND
                      ('$(OutputType)' == 'Exe' OR '$(OutputType)'=='WinExe') AND
                      !('$(TargetFrameworkIdentifier)' == '.NETCoreApp' AND '$(PlatformTarget)' == '')"
           Text="Microsoft.ML currently supports 'x64' and 'x86' processor architectures. Please ensure your application is targeting 'x64' or 'x86'." />
  </Target>
  
</Project>

They also put it under build\netstandard2.0, but I'm not sure why it's in build rather than buildTransitive. Perhaps I understand this wrong: for the simple case where we reference the NuGet package directly it probably won't matter, but it might make a difference when you reference a library which references the package. I'd also assume they did it correctly. I'll try to get more details.

It seems this mechanism is needed when the project uses packages.config.

@krwq

krwq commented Mar 11, 2025

@asg017 I don't want to bore you with the above :-) TL;DR: we'll probably need a Windows x86 asset to get this package to work correctly with most .NET versions, because the old ones use x86 by default. Do you think that would be a feasible request?

@krwq

krwq commented Mar 11, 2025

@aaronpowell I just validated that the targets/props from ML.NET work without modifications (except that the error message shown when a native asset is missing needs to be changed, because it mentions ML.NET). It also prints an error message that the asset is missing for this architecture (only on netfx though).

@aaronpowell
Author

I've updated the PR to include generating the props and targets files.
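
As a rough illustration of what generating the props file can look like, the Rust sketch below renders one `<Content>` item per Windows RID, following the shape of the Microsoft.ML props quoted earlier in the thread. The function names `content_item` and `props_file` are hypothetical, and the `packages.config` condition on the `ItemGroup` is left out for brevity; this is not the PR's actual generator.

```rust
/// Render one <Content> item that copies a RID's native DLLs to the output
/// directory when the project's PlatformTarget matches.
fn content_item(rid: &str, platform_target: &str) -> String {
    format!(
        r#"    <Content Include="$(MSBuildThisFileDirectory)\..\..\runtimes\{rid}\native\*.dll"
             Condition="'$(PlatformTarget)' == '{pt}'">
      <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
      <Visible>false</Visible>
      <Link>%(Filename)%(Extension)</Link>
    </Content>
"#,
        rid = rid,
        pt = platform_target
    )
}

/// Wrap the per-RID items in a minimal MSBuild project.
/// `targets` is a list of (RID, PlatformTarget) pairs, e.g. ("win-x64", "x64").
fn props_file(targets: &[(&str, &str)]) -> String {
    let mut out = String::from(
        "<Project xmlns=\"http://schemas.microsoft.com/developer/msbuild/2003\">\n  <ItemGroup>\n",
    );
    for (rid, pt) in targets {
        out.push_str(&content_item(rid, pt));
    }
    out.push_str("  </ItemGroup>\n</Project>\n");
    out
}
```

Passing only the RIDs that were actually built (for example just `("win-x64", "x64")` while no x86 asset exists) keeps the props file from pointing at folders that are missing from the package.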

Co-authored-by: Krzysztof Wicher <[email protected]>

@krwq krwq left a comment

LGTM

@roji

roji commented Mar 18, 2025

@asg017 this PR seems ready for merging - we're in the process of building samples for .NET and SQLite and would love to be able to show the streamlined experience! Will you be able to give this your attention in the next few days?

Note that you'll also need to publish the package for consumption by .NET users, on nuget.org. We can help with that if you want.

@asg017
Owner

asg017 commented Mar 18, 2025

Hey @roji + @aaronpowell , thanks for all your work here! Definitely want to get this in.

I checked out this branch and ran it with the new nuget = {} configuration, here's the output: https://delicate-dream-7407.fly.storage.tigris.dev/sqlite-sample.0.0.1.nupkg

I'm not familiar with dotnet, so I have a few questions:

  • Is there a way to "install" the above sqlite-sample.0.0.1.nupkg file in a dotnet project? With Python I can pip install a wheel file as-is, but not sure about it in dotnet environments
  • Once the package is installed, what's the minimum dotnet code needed to do a "hello world" (ie, load the extension + call a new SQL function)?
  • Are there multiple dotnet SQLite client libraries to worry about, or is there a "standard" SQLite package that the community has rallied around?

If you want to build this sqlite-sample yourself, it's defined in the sample/ directory, and requires zig. Run build.sh and it'll compile the sqlite-sample.c extension for multiple platforms inside sample/dist, then you can point sqlite-dist to it like so:

cargo run -- \
  sample/sqlite-dist.toml \
  --input sample/dist/ \
  --output tmp \
  --version 0.0.1

Then tmp will contain all the packed distributions of the extension, pip/npm/etc. To try out this branch, add nuget = {} to sample/sqlite-dist.toml.

(apologies for the hackiness - I'll be documenting the sqlite-sample workflow soon)

Also - I'll look into getting x64 binaries built for sqlite-vec

@asg017
Owner

asg017 commented Mar 18, 2025

(to build the samples with zig, you may need sqlite3ext.h inside sample/vendor, which you can download from here: https://www.sqlite.org/2025/sqlite-amalgamation-3490100.zip )

@aaronpowell
Author

Here's a super slimmed down version of how to use the extension NuGet package you created above @asg017. The only thing required would be a local install of .NET 8 (link for installers).

sqlite-extension-demo.zip

This is the code:

using Microsoft.Data.Sqlite;

var connectionString = "Data Source=:memory:";
using var connection = new SqliteConnection(connectionString);

connection.LoadExtension("sample0"); // load extension from NuGet package

connection.Open();

var command = connection.CreateCommand();
command.CommandText = "SELECT sample_version()";

var result = command.ExecuteScalar();
Console.WriteLine($"SQLite sample extension version: {result}");

The one thing you'll note is that the connection.LoadExtension call is sample0, and I've renamed (and updated the ID in the metadata) the NuGet package to be sample0, as the way the .NET loader works is that it'll use the package ID as the filename to load.

@krwq I wonder if we should have an overload on LoadExtension that allows you to specify the filename and package ID separately, or is that already possible? (It didn't seem so in my quick testing.)

@krwq

krwq commented Mar 19, 2025

@aaronpowell the API is a thin shim over sqlite3_load_extension, which works with a full path, a path without an extension, or just a file name (which it will search for in its default directories). The fix I made to efcore makes it also search the directories with native binaries that are specific to .NET. But to answer your question: you don't pass the package name anywhere, but you can pass the name of the entry point in your extension as the second, optional argument.

@asg017
Owner

asg017 commented Mar 19, 2025

Thank you @aaronpowell for sharing!

So in the other bindings, we have a pattern of exposing a programming language-level load function that abstracts out some details away. For example, in Python:

import sqlite3
import sqlite_vec

db = sqlite3.connect(":memory:")
db.enable_load_extension(True)
sqlite_vec.load(db)

And JavaScript:

import * as sqliteVec from "sqlite-vec";
import Database from "better-sqlite3";

const db = new Database(":memory:");
sqliteVec.load(db);

And Ruby:

require 'sqlite3'
require 'sqlite_vec'

db = SQLite3::Database.new(':memory:')
db.enable_load_extension(true)
SqliteVec.load(db)

In other words, the sqlite_vec.load(db), sqliteVec.load(db), and SqliteVec.load(db) lines abstract away the need for calling .load_extension() manually, mostly because finding the correct path to the compiled extension is hard for end users.

Would it be possible to include an API like that in this? So something like:

using Microsoft.Data.Sqlite;
using SqliteSample.Extension;
var connectionString = "Data Source=:memory:";
using var connection = new SqliteConnection(connectionString);
SqliteSample.load(connection);

Not 100% sure if that's a valid package name to include.

The current manual connection.LoadExtension("sample0") is fine, but it would require sample1/sample2 updates on major version bumps, which is a bit awkward. Also, I'm a bit wary of the "lookup other dylib in multiple directories" approach, as calling dlopen on any malicious dylib can be a security issue.

If that's not possible, maybe we could expose the libname as a variable?

using Microsoft.Data.Sqlite;
using SqliteSample.Extension;
var connectionString = "Data Source=:memory:";
using var connection = new SqliteConnection(connectionString);
connection.LoadExtension(SqliteSample.name); // 'sample0'

Again, not familiar with dotnet so open to other API suggestions!

@roji

roji commented Mar 19, 2025

Would it be possible to include an API like that in this? So something like:

I think that in the .NET world, it could be reasonable to simply add an extension method over SqliteConnection to load sqlite_vec, so the user would write the following:

using var connection = new SqliteConnection(connectionString);
connection.LoadSqliteVec();

// The extension method definition:
public static class SqliteConnectionExtensions
{
    public static void LoadSqliteVec(this SqliteConnection connection)
        => connection.LoadExtension("sqlite_vec0"); // Or whatever the name of the binary is that the extension deploys
}

Makes sense @krwq @aaronpowell?

@aaronpowell
Author

Yeah that'd work, but wouldn't we have to build a .NET binary on the CI of anything using this, thus adding some overhead to the consumer? Or maybe we could get away with a source only NuGet package but I've never tested that with runtimes in it like this (I guess it'd work)

@krwq

krwq commented Mar 20, 2025

I'm not sure it's worth adding a library just to call connection.LoadExtension("vec0"), especially since we made it work without needing the path.

I don't feel particularly strongly about having it or not, but if we decide to have it then IMO an extension method makes the most sense and, as @aaronpowell suggested, I think it would make sense for this to be a source-only package, since a DLL is pretty heavy for just a single method.

@aaronpowell
Author

I'll admit I made a mistake in my assumption that the DLL name has to match the package ID, I thought it was doing some lookup through the referenced packages, but that's wrong, the runtime folders are copied to the output on build and it looks there.

@roji

roji commented Mar 20, 2025

I'm not sure it's worth adding a library just to call connection.LoadExtension("vec0"), especially since we made it work without needing the path.

@krwq the main value I see in having the extension method is that the user doesn't have to embed a magic string (the name of the DLL), which is indeed not ideal... It's true that this seems heavy for an additional .NET DLL, but on the other hand that's presumably just an internal detail inside the nupkg that no user will actually need to know or care about (or am I missing something)?

I definitely don't think that telling the user to install yet another source-only thing just for the extension is great - I'd either go with an in-the-box extension method in the same nupkg, or just drop the whole thing and tell them to call LoadExtension("vec0").

But I don't have any strong feelings here - @krwq @aaronpowell @asg017 I'm OK with whatever you think is best.

@aaronpowell
Author

I'll have a go at making it a source only package tomorrow

@asg017
Owner

asg017 commented Mar 20, 2025

Ah, didn't realize that adding a C# wrapper would require compiling a binary. Don't think it's worth it then, can always update later if needed.

Let's go with this then - but one question about publishing:

Which registry/registries should I submit the sqlite-sample.0.0.1.nupkg file to? Is there a commonly-used Github Action people use to publish these?

@roji

roji commented Mar 20, 2025

@asg017 yeah, the standard/public place for publishing .NET nuget packages is nuget.org - all .NET tools and IDEs look for packages there by default.

You can set up a Github Action in your pipeline if you want (I can help with samples), but at least for the start you can also just manually upload your package via the web UI (it's super easy). Naming-wise you could call it SqliteVec (it's not taken), or if you want to give it some prefix (e.g. your name or something) you can (e.g. XXXX.SqliteVec or whatever makes sense).

Once that's uploaded, anyone can start using the package by doing dotnet add package SqliteVec (docs).

@aaronpowell
Author

Had a little play to make it include a C# file which will provide an extension method. Here's an example of what it would look like with the SQLite Vector extension:

var connectionString = "Data Source=:memory:";
using var connection = new SqliteConnection(connectionString);

connection.LoadSqliteVec();

connection.Open();
var command = connection.CreateCommand();
command.CommandText = "SELECT vec_version()";
var result = command.ExecuteScalar();
Console.WriteLine($"SQLite vec extension version: {result}");

It doesn't require any compilation in the CI process, as there is a "loose" C# file included in the NuGet package, and it gets picked up once the package is added to a project. Here's what that looks like in VS:
[Screenshot: Visual Studio project tree]

You can see the SqliteVec node in the project tree and I've expanded it out to show the method that was added. The generated C# code is:

namespace Microsoft.Data.Sqlite
{
    public static class SqliteSqliteVecExtensions
    {
        public static void LoadSqliteVec(this SqliteConnection connection)
          => connection.LoadExtension("vec0");
    }
}

To make this work, I've added a new property to the NuGet TargetSpec called friendly_name, which is used to generate the method/class name. If it's not provided, it falls back to using the package ID with some basic formatting applied to it (but that won't always generate a "standard" C#-esque name).
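
The fallback described above could look roughly like the Rust sketch below. The exact formatting rule the PR applies is not shown in this thread, so the PascalCase conversion and the helper names `pascal_case` and `identifier_for` are assumptions for illustration only.

```rust
/// Turn a package ID such as "sqlite-vec" into a PascalCase identifier
/// by splitting on non-alphanumeric characters and capitalizing each part.
fn pascal_case(package_id: &str) -> String {
    package_id
        .split(|c: char| !c.is_ascii_alphanumeric())
        .filter(|part| !part.is_empty())
        .map(|part| {
            let mut chars = part.chars();
            match chars.next() {
                Some(first) => first.to_ascii_uppercase().to_string() + chars.as_str(),
                None => String::new(),
            }
        })
        .collect()
}

/// Prefer an explicit friendly_name; otherwise derive a name from the package ID.
fn identifier_for(friendly_name: Option<&str>, package_id: &str) -> String {
    match friendly_name {
        Some(name) => name.to_string(),
        None => pascal_case(package_id),
    }
}
```

A package ID without separators, such as "sqlitevec", would come out as "Sqlitevec" under this rule, which is the kind of non-standard name that an explicit friendly_name avoids.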

Here's the NuGet file I used for testing:

sqlite-vec.0.0.0-alpha.3.zip

@krwq

krwq commented Mar 24, 2025

@asg017 you can use dotnet nuget push to push your package. Use --help or search for examples how to use.

To get dotnet you have a couple of options

@adamsitnik

I've tried to use the package provided by @aaronpowell to run Semantic Kernel tests. Thanks to help from @krwq I was able to get it working.

FWIW the code is available here

cc @roji @jeffhandley

@roji

roji commented Mar 26, 2025

Great news @adamsitnik, thanks for confirming! FYI @krwq's change to Microsoft.Data.Sqlite was backported to 9.0 and 8.0, so when the next patch version is out we can modify Microsoft.Extensions.VectorData to use all this stuff (that's microsoft/semantic-kernel#11155, just assigned to you as you've already done the work ;)).

@roji

roji commented Apr 1, 2025

@asg017 just checking in, are you having any trouble with nuget.org or similar? Would be great to see this merged and published soon as we'd like to take a dependency on it etc.

@asg017
Owner

asg017 commented Apr 8, 2025

Hey apologies for the delay - will take a 2nd look at this and hopefully merge in this week!

@roji

roji commented Apr 8, 2025

Thank you @asg017! Please don't hesitate to ping us if you run into any trouble.

@sandyarmstrong

Had a little play to make it include a C# file which will provide an extension method.

Personally, I don't see a lot of value to this extension method. It doesn't make anything more obvious, you still need to read the docs to know you need to call it, and it doesn't save any work really. I believe a .NET-specific docs page would be sufficient here.

Testing out the nupkg you linked, it broke my team's build by default because the extension class violates our code style rules. I don't think most .NET devs are aware that nupkg's can add source to a project, so I could see the presentation of this sort of error being pretty confusing.

If you think it's important to include the extension, I'd suggest adding a comment at the top explaining that it came from the nupkg, and that the file can be excluded from your project by adding ExcludeAssets="contentFiles" to your PackageReference.

@roji

roji commented Apr 11, 2025

Personally, I don't see a lot of value to this extension method. It doesn't make anything more obvious, you still need to read the docs to know you need to call it, and it doesn't save any work really. I believe a .NET-specific docs page would be sufficient here.

I disagree. The following:

connection.LoadSqliteVec();

... seems more discoverable, less error-prone and generally nicer than this:

connection.LoadExtension("sqlite_vec0");

LoadSqliteVec() can even be discovered via Intellisense - the moment one does connection., when they type the dot they get completion proposals and LoadSqliteVec() would be one of them. That really is much better than a magic sqlite_vec0 string, which nobody can ever guess and can only be copy-pasted from the docs.

Testing out the nupkg you linked, it broke my team's build by default because the extension class violates our code style rules.

What kind of style rules are you using? LoadSqliteVec() is standard .NET naming.

I don't think most .NET devs are aware that nupkg's can add source to a project, so I could see the presentation of this sort of error being pretty confusing.

I honestly don't see devs getting actually bit by this. We can always add a comment though.

@sandyarmstrong

What kind of style rules are you using? LoadSqliteVec() is standard .NET naming

My team has many style rules that this file fails. For example, it's missing a copyright header and does not use file-scoped namespaces. But another team may have completely opposite style rules. You'd be surprised how many teams enforce style rules at build time.

@roji

roji commented Apr 12, 2025

Thanks for the info @sandyarmstrong, that makes sense. A better way to deal with this may be to simply add // <auto-generated/> at the beginning - AFAIK that causes most analyzers/rule enforcers to disregard the file entirely.

@aaronpowell @krwq @adamsitnik what do you think? Is there some sort of better guidance for nuget-added source files?

@adamsitnik

Thanks for the info @sandyarmstrong, that makes sense. A better way to deal with this may be to simply add // <auto-generated/> at the beginning - AFAIK that causes most analyzers/rule enforcers to disregard the file entirely.

@aaronpowell @krwq @adamsitnik what do you think? Is there some sort of better guidance for nuget-added source files?

I don't have anything better than // <auto-generated/>

@roji

roji commented Apr 14, 2025

OK, I've made this suggestion which can be accepted to add <auto-generated/> to the included source file, this should take care of style/analyzer warnings. It makes sense - in a way this file is indeed auto-generated into the user's project.

@asg017 I think if you accept this suggestion everything should be ready for publishing the nuget.
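
For reference, here is a minimal Rust sketch of a template that emits the C# file with the `// <auto-generated/>` header on its first line. It reuses the `{0}` placeholder and `{{` brace escaping visible in the reviewed diff, but the function name `extension_source` and the exact file layout are assumptions, not the PR's code.

```rust
/// Render the C# source file that the package adds to the consuming project.
/// `name` is the identifier part (e.g. "SqliteVec"); `lib` is the native library name (e.g. "vec0").
fn extension_source(name: &str, lib: &str) -> String {
    // `{{` and `}}` are literal braces in Rust format strings; {0} and {1} are the arguments.
    format!(
        r#"// <auto-generated/>
namespace Microsoft.Data.Sqlite
{{
    public static class Sqlite{0}Extensions
    {{
        public static void Load{0}(this SqliteConnection connection)
            => connection.LoadExtension("{1}");
    }}
}}
"#,
        name, lib
    )
}
```

Keeping the marker as the very first line matters because analyzers generally look for it in the leading comment of the file.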

@aaronpowell
Author

Change from @roji applied.

I'm not able to generate a new nuget package for testing for a few more days, currently traveling and don't have access to the machine with everything setup for that.

@roji

roji commented Apr 15, 2025

@asg017 I think we're good to go here - and time is really starting to get tight; hope you can merge this and publish the nuget soon!

@roji

roji commented Apr 24, 2025

@asg017 any update here?

Comment on lines +229 to +231
public static class Sqlite{0}Extensions
{{
public static void Load{0}(this SqliteConnection connection)

@adamsitnik adamsitnik May 9, 2025

Please see jeffhandley#1 for details

Suggested change
public static class Sqlite{0}Extensions
{{
public static void Load{0}(this SqliteConnection connection)
/// <summary>
/// Utility for loading the SQLite vector extension.
/// </summary>
public static class Sqlite{0}Extensions
{{
/// <summary>
/// Loads the {1} SQLite vector extension.
/// </summary>
/// <param name="connection">A connection to a SQLite database.</param>
public static void Load{0}(this SqliteConnection connection)

Suggest adding "the" on line 230: Utility for loading the SQLite vector extension.

@jeffhandley

This pull request should be closed. I will create a fresh pull request to replace it, as I ended up adding several additional commits into 0.0.1-alpha.19.nuget (jeffhandley/sqlite-dist) to get the NuGet package published as NuGet Gallery | sqlite-vec 0.1.7-alpha.2.1.
