Interesting. You'd end up with a lot of objects with that approach, and eventually they would no longer fit within the event size limit. I thought about doing it by storing packs in blossom. Here is my code to play with that idea. I would have made it into a POC if rust-nostr had had blossom support at the time. It does now. It turns out that having a git server is way more flexible, so ngit.dev/grasp came to be. Let git be git and let nostr be nostr.
DanConwayDev
From 6bcb58925ad5a7ec2421718fb2996add9080f7bc Mon Sep 17 00:00:00 2001
From: DanConwayDev <DanConwayDev@protonmail.com>
Date: Fri, 15 Nov 2024 11:57:10 +0000
Subject: [PATCH] feat(blossom): blossom as remote using packs

This is a WIP exploration of the use of blossom as an optional
alternative to using a git server.

The incomplete code focuses on how blossom could fit with nip34 to
most efficiently replace the git server. It is missing the actual
blossom interaction, which would hopefully be facilitated by a new
blossom feature in rust-nostr.

This implementation tries to minimise the number of blobs required
for download by using packs. If a branch tip is at height 1304 it
will split the commits into a number of packs: a pack of the first
1024 commits, the next 256, the next 16 and the final 8.

I planned for the identification of blossom servers to mirror the
approach taken for relays:
1. list repository blossom servers in repo announcement event
   kind 30617
2. also push to user blossom servers in the standard event for that

This is not implemented, along with the rest of the blossom aspects.

I'm publishing this now as @npub1elta...cume has recently published
a POC of an alternative approach and it makes sense to share this
alternative idea.
---
 Cargo.lock                        |   1 +
 Cargo.toml                        |   1 +
 src/bin/git_remote_nostr/fetch.rs |   4 ++++
 src/bin/git_remote_nostr/list.rs  |  23 ++++++++++++++++++++++-
 src/bin/git_remote_nostr/push.rs  | 124 ++++++++++++++++++++++++++++++++++++++++++++++++++----
 src/lib/repo_state.rs             |  17 ++++++++++++++++-
 6 files changed, 163 insertions(+), 7 deletions(-)

diff --git a/Cargo.lock b/Cargo.lock
index b20b60a..72b37a2 100644
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -1805,6 +1805,7 @@ dependencies = [
  "serde_json",
  "serde_yaml",
  "serial_test",
+ "sha2",
  "test_utils",
  "tokio",
  "urlencoding",
diff --git a/Cargo.toml b/Cargo.toml
index ed99aea..320a9f0 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -38,6 +38,7 @@ serde_yaml = "0.9.27"
 tokio = "1.33.0"
 urlencoding = "2.1.3"
 zeroize = "1.6.0"
+sha2 = "0.10.8"
 
 [dev-dependencies]
 assert_cmd = "2.0.12"
diff --git a/src/bin/git_remote_nostr/fetch.rs b/src/bin/git_remote_nostr/fetch.rs
index a972a2f..a1116c5 100644
--- a/src/bin/git_remote_nostr/fetch.rs
+++ b/src/bin/git_remote_nostr/fetch.rs
@@ -49,6 +49,10 @@ pub async fn run_fetch(
     let term = console::Term::stderr();
 
     for git_server_url in &repo_ref.git_server {
+        if git_server_url.eq("blossom") {
+            // TODO download missing blobs
+            continue;
+        }
         let term = console::Term::stderr();
         if let Err(error) = fetch_from_git_server(
             git_repo,
diff --git a/src/bin/git_remote_nostr/list.rs b/src/bin/git_remote_nostr/list.rs
index 92faa6b..d71c2d1 100644
--- a/src/bin/git_remote_nostr/list.rs
+++ b/src/bin/git_remote_nostr/list.rs
@@ -43,7 +43,28 @@ pub async fn run_list(
 
     let term = console::Term::stderr();
 
-    let remote_states = list_from_remotes(&term, git_repo, &repo_ref.git_server, decoded_nostr_url);
+    let mut remote_states = list_from_remotes(
+        &term,
+        git_repo,
+        &repo_ref
+            .git_server
+            .iter()
+            // blossom will always match nostr state
+            .filter(|s| !s.starts_with("blossom"))
+            .map(std::borrow::ToOwned::to_owned)
+            .collect::<Vec<String>>(),
+        decoded_nostr_url,
+    );
+    if repo_ref.git_server.iter().any(|s| s.eq("blossom")) {
+        if let Some(nostr_state) = nostr_state.clone() {
+            remote_states.insert("blossom".to_owned(), nostr_state.state.clone());
+        } else if let Some((_, state)) = remote_states.iter().last() {
+            remote_states.insert("blossom".to_owned(), state.clone());
+        } else {
+            // create blank state if no nostr state exists yet
+            remote_states.insert("blossom".to_owned(), HashMap::new());
+        }
+    }
 
     let mut state = if let Some(nostr_state) = nostr_state {
         for (name, value) in &nostr_state.state {
diff --git a/src/bin/git_remote_nostr/push.rs b/src/bin/git_remote_nostr/push.rs
index db86c04..a12e8ba 100644
--- a/src/bin/git_remote_nostr/push.rs
+++ b/src/bin/git_remote_nostr/push.rs
@@ -2,6 +2,7 @@ use core::str;
 use std::{
     collections::{HashMap, HashSet},
     io::Stdin,
+    str::FromStr,
     sync::{Arc, Mutex},
     time::Instant,
 };
@@ -11,7 +12,7 @@ use auth_git2::GitAuthenticator;
 use client::{get_events_from_cache, get_state_from_cache, send_events, sign_event, STATE_KIND};
 use console::Term;
 use git::{sha1_to_oid, RepoActions};
-use git2::{Oid, Repository};
+use git2::{Buf, Commit, Oid, Repository};
 use git_events::{
     generate_cover_letter_and_patch_events, generate_patch_event, get_commit_id_from_patch,
 };
@@ -29,11 +30,17 @@ use ngit::{
 };
 use nostr::nips::nip10::Marker;
 use nostr_sdk::{
-    hashes::sha1::Hash as Sha1Hash, Event, EventBuilder, EventId, Kind, PublicKey, Tag,
+    hashes::{
+        hex::DisplayHex,
+        sha1::Hash as Sha1Hash,
+        sha256::{self, Hash as Sha256Hash},
+    },
+    Event, EventBuilder, EventId, Kind, PublicKey, Tag,
 };
 use nostr_signer::NostrSigner;
 use repo_ref::RepoRef;
 use repo_state::RepoState;
+use sha2::{Digest, Sha256};
 
 use crate::{
     client::Client,
@@ -74,7 +81,17 @@ pub async fn run_push(
 
     let list_outputs = match list_outputs {
         Some(outputs) => outputs,
-        _ => list_from_remotes(&term, git_repo, &repo_ref.git_server, decoded_nostr_url),
+        _ => list_from_remotes(
+            &term,
+            git_repo,
+            &repo_ref
+                .git_server
+                .iter()
+                .filter(|s| !s.eq(&"blossom"))
+                .map(std::string::ToString::to_string)
+                .collect(),
+            decoded_nostr_url,
+        ),
     };
 
     let nostr_state = get_state_from_cache(git_repo.get_path()?, repo_ref).await;
@@ -150,11 +167,24 @@ pub async fn run_push(
         }
     }
 
+    let mut blossom_packs: Option<HashMap<sha256::Hash, Buf>> = None;
     if !git_server_refspecs.is_empty() {
         let new_state = generate_updated_state(git_repo, &existing_state, &git_server_refspecs)?;
+        let blossom_hashes = if repo_ref.git_server.contains(&"blossom".to_string()) {
+            let (blossom_hashes, packs) = create_blossom_packs(&new_state, git_repo)?;
+            blossom_packs = Some(packs);
+            blossom_hashes
+        } else {
+            HashSet::new()
+        };
 
-        let new_repo_state =
-            RepoState::build(repo_ref.identifier.clone(), new_state, &signer).await?;
+        let new_repo_state = RepoState::build(
+            repo_ref.identifier.clone(),
+            new_state,
+            blossom_hashes,
+            &signer,
+        )
+        .await?;
 
         events.push(new_repo_state.event);
 
@@ -325,6 +355,13 @@ pub async fn run_push(
 
         // TODO make async - check gitlib2 callbacks work async
 
+        if let Some(packs) = blossom_packs {
+            // TODO: upload blossom packs
+            for (_hash, _pack) in packs {
+                // blossom::upload(pack)
+            }
+        }
+
         for (git_server_url, remote_refspecs) in remote_refspecs {
             let remote_refspecs = remote_refspecs
                 .iter()
@@ -863,6 +900,71 @@ fn generate_updated_state(
     Ok(new_state)
 }
 
+fn create_blossom_packs(
+    state: &HashMap<String, String>,
+    git_repo: &Repo,
+) -> Result<(HashSet<sha256::Hash>, HashMap<sha256::Hash, Buf>)> {
+    let mut blossom_hashes = HashSet::new();
+    let mut blossom_packs = HashMap::new();
+    for commit_id in state.values() {
+        if let Ok(oid) = Oid::from_str(commit_id) {
+            if let Ok(commit) = git_repo.git_repo.find_commit(oid) {
+                let height = get_height(&commit, git_repo)?;
+                let mut revwalk = git_repo.git_repo.revwalk()?;
+                revwalk.push(oid)?;
+                let mut counter = 0;
+                for pack_size in split_into_powers_of_2(height) {
+                    let mut pack = git_repo.git_repo.packbuilder()?;
+                    while counter < pack_size {
+                        if let Some(oid) = revwalk.next() {
+                            pack.insert_commit(oid?)?;
+                            counter += 1;
+                        }
+                    }
+                    let mut buffer = Buf::new();
+                    pack.write_buf(&mut buffer)?;
+                    let hash = buffer_to_sha256_hash(&buffer);
+                    blossom_hashes.insert(hash);
+                    blossom_packs.insert(hash, buffer);
+                    counter = 0;
+                }
+            }
+        }
+    }
+    Ok((blossom_hashes, blossom_packs))
+}
+
+fn get_height(commit: &Commit, git_repo: &Repo) -> Result<u32> {
+    let mut revwalk = git_repo.git_repo.revwalk()?;
+    revwalk.push(commit.id())?;
+    Ok(u32::try_from(revwalk.count())?)
+}
+
+fn split_into_powers_of_2(height: u32) -> Vec<u32> {
+    let mut powers = Vec::new();
+    let mut remaining = height;
+
+    // Decompose the height into powers of 2
+    for i in (0..32).rev() {
+        let power = 1 << i; // Calculate 2^i
+        while remaining >= power {
+            powers.push(power);
+            remaining -= power;
+        }
+    }
+
+    powers
+}
+
+fn buffer_to_sha256_hash(buffer: &Buf) -> sha256::Hash {
+    let mut hasher = Sha256::new();
+    hasher.update(buffer.as_ref());
+    let hash = hasher
+        .finalize()
+        .to_hex_string(nostr_sdk::hashes::hex::Case::Lower);
+    sha256::Hash::from_str(&hash).unwrap()
+}
+
 async fn get_merged_status_events(
     term: &console::Term,
     repo_ref: &RepoRef,
@@ -1186,6 +1288,7 @@ trait BuildRepoState {
     async fn build(
         identifier: String,
         state: HashMap<String, String>,
+        blossom: HashSet<Sha256Hash>,
         signer: &NostrSigner,
     ) -> Result<RepoState>;
 }
@@ -1193,6 +1296,7 @@ impl BuildRepoState for RepoState {
     async fn build(
         identifier: String,
         state: HashMap<String, String>,
+        blossom: HashSet<Sha256Hash>,
         signer: &NostrSigner,
     ) -> Result<RepoState> {
         let mut tags = vec![Tag::identifier(identifier.clone())];
@@ -1202,10 +1306,20 @@ impl BuildRepoState for RepoState {
                 vec![value.clone()],
             ));
         }
+        if !blossom.is_empty() {
+            tags.push(Tag::custom(
+                nostr_sdk::TagKind::Custom("blossom".into()),
+                blossom
+                    .iter()
+                    .map(std::string::ToString::to_string)
+                    .collect::<Vec<String>>(),
+            ));
+        }
         let event = sign_event(EventBuilder::new(STATE_KIND, "", tags), signer).await?;
         Ok(RepoState {
             identifier,
             state,
+            blossom,
             event,
         })
     }
diff --git a/src/lib/repo_state.rs b/src/lib/repo_state.rs
index c3a7606..19e78b6 100644
--- a/src/lib/repo_state.rs
+++ b/src/lib/repo_state.rs
@@ -1,11 +1,17 @@
-use std::collections::HashMap;
+use std::{
+    collections::{HashMap, HashSet},
+    str::FromStr,
+};
 
 use anyhow::{Context, Result};
 use git2::Oid;
+use nostr_sdk::hashes::sha256::Hash;
 
+#[derive(Clone)]
 pub struct RepoState {
     pub identifier: String,
     pub state: HashMap<String, String>,
+    pub blossom: HashSet<Hash>,
     pub event: nostr::Event,
 }
 
@@ -14,6 +20,7 @@ impl RepoState {
         state_events.sort_by_key(|e| e.created_at);
         let event = state_events.first().context("no state events")?;
         let mut state = HashMap::new();
+        let mut blossom = HashSet::new();
         for tag in event.tags.iter() {
             if let Some(name) = tag.as_slice().first() {
                 if ["refs/heads/", "refs/tags", "HEAD"]
@@ -26,6 +33,13 @@ impl RepoState {
                         }
                     }
                 }
+                if name.eq("blossom") {
+                    for s in tag.clone().to_vec() {
+                        if let Ok(hash) = Hash::from_str(&s) {
+                            blossom.insert(hash);
+                        }
+                    }
+                }
             }
         }
         Ok(RepoState {
@@ -35,6 +49,7 @@ impl RepoState {
                 .context("existing event must have an identifier")?
                 .to_string(),
             state,
+            blossom,
             event: event.clone(),
         })
     }
-- 
libgit2 1.8.1
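The pack-splitting idea from the commit message can be tried on its own, without git2 or blossom. The following is a minimal sketch, not the code from the patch: `pack_ranges` is a hypothetical helper, and it assumes commits are numbered oldest-first so that "the first 1024 commits" means positions 0..1024.

```rust
/// Decompose `height` into descending powers of two (one per set bit).
fn split_into_powers_of_2(height: u32) -> Vec<u32> {
    (0..u32::BITS)
        .rev()
        .map(|i| 1u32 << i)
        .filter(|power| (height & *power) != 0)
        .collect()
}

/// Hypothetical helper: turn the pack sizes into half-open commit ranges,
/// assuming position 0 is the oldest commit on the branch.
fn pack_ranges(height: u32) -> Vec<(u32, u32)> {
    let mut start = 0;
    split_into_powers_of_2(height)
        .into_iter()
        .map(|size| {
            let range = (start, start + size);
            start += size;
            range
        })
        .collect()
}

fn main() {
    // The example from the commit message: a branch tip at height 1304.
    for (from, to) in pack_ranges(1304) {
        println!("pack covers commits {from}..{to} ({} commits)", to - from);
    }
}
```

Under that oldest-first assumption, adding one commit (height 1305) would leave the 1024, 256, 16 and 8 packs unchanged and add a single pack of 1, so a client that already holds the large packs would only need the small new blob.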

Replies (6)

you could use Go and i already have a second draft blossom server written in go. i didn't write it. claude spun it up in about 3 hours and then another hour fixing it and i just haven't tested it yet. i know it works because it's just http and the tests pass, and i saw it accepting, and allowing me to delete a random blob several times. i just haven't used it. probably will already serve you with blossom. imma make sure you both have permissions in case you want to try it
my take on this is look into techniques used in computer games. i remember when GTA3 came out, and its most epic achievement was loading free inter-map transit. still very few games use this but it's a graph theory algorithm. this is the kind of thing you need to automatically, and quickly partition a map of related data. you need metrics of proximity and some kind of parameters for partitioning the map to fit the compute you need to do. it's not hard. but it may take a while to wrap your head around it. but graphs at high node count are N! style compute cost. so it only takes like 3 or 4 deep and you are practically at infinity as far as even the most powerful computers can do in milliseconds.
yeah you'd bloat the event with every object ref as the repo grows. not a great design but a fun poc. i wrote my poc in go. the code is actually hosted on itself, as the poc is a relay/blossom/webui all in one binary server. i also wrote my own git-nostr-remote for the client side. it was a fun hack and generally works for the happy path. not planning to pursue it. i can share the code if you’re interested in it.
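To put rough numbers on the bloat concern, here is a back-of-the-envelope sketch comparing the two approaches. Everything in it is an assumption for illustration: the function names are hypothetical, ids are counted as bare hex strings (40 characters for a git SHA-1 object id, 64 for a SHA-256 blob hash), the figure of three objects per commit is a guess, and tag overhead and multiple branches are ignored.

```rust
// Assumed hex lengths of the ids carried in an event.
const SHA1_HEX_LEN: u64 = 40; // git object id
const SHA256_HEX_LEN: u64 = 64; // blossom blob hash

/// One ref per git object: grows linearly with the object count.
fn bytes_for_object_refs(objects: u64) -> u64 {
    objects * SHA1_HEX_LEN
}

/// One hash per pack for a single branch tip: the number of packs is the
/// number of set bits in the height, so at most 32 for a u32 height.
fn bytes_for_pack_hashes(height: u32) -> u64 {
    u64::from(height.count_ones()) * SHA256_HEX_LEN
}

fn main() {
    // Assumed repo: 1304 commits with roughly 3 objects per commit.
    let height = 1304u32;
    let objects = u64::from(height) * 3;
    println!("object refs: {} bytes", bytes_for_object_refs(objects));
    println!("pack hashes: {} bytes", bytes_for_pack_hashes(height));
}
```

Under these assumptions the per-object list grows with every push, while the pack-hash list for one branch stays bounded by 32 entries however long the history gets.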
The problem is that git enables many features like shallow and sparse cloning, packing specific objects and data, getting specific files and git logs, etc. These are all battle-tested on solid git server implementations. None of this is possible when trying to reinvent a simplified git server with blossom.
Issues and PRs (kinds 9803/9804) are automatically published to nostr on handled status changes (merged, closed and reopened). I fetch them from the source if possible when importing the repo and try to aggregate them by timestamp with the nostr kinds. If the source is, let's say, GitHub, I'm not upstreaming the edits there so far. Anyway, it still needs polish in finding these kinds better, and the flows are surely not the endgame, but it's what I went with so far 🤓