Description
We get a wrong infohash for this torrent:
https://academictorrents.com/details/6c690018c5786dbbb00161f62b0712d69296df97
rapppid-weights.tar-6c690018c5786dbbb00161f62b0712d69296df97.zip
We get this infohash: 8aa01a4c816332045ffec83247ccbc654547fedf
The problem is that the info dictionary contains a custom field, collections:
{
"collections": [
"org.archive.rapppid-weights.tar"
],
"files": [
{
"crc32": "57d33fcc",
"length": 11528324,
"md5": "e91bb4ba82695161be68f8b33ae76142",
"mtime": "1689273730",
"path": [
"RAPPPID Weights.tar.gz"
],
"sha1": "45970ef33cb3049a7a8629e40c8f5e5268d1dc53"
},
{
"crc32": "c658fd4f",
"length": 20480,
"md5": "a782b2a53ba49f0d45f3dd6e35e0d593",
"mtime": "1689273783",
"path": [
"rapppid-weights.tar_meta.sqlite"
],
"sha1": "bcb06b3164f1d2aba22ef6046eb80f65264e9fba"
},
{
"crc32": "8140a5c7",
"length": 1044,
"md5": "1bab21e50e06ab42d3a77d872bf252e5",
"mtime": "1689273763",
"path": [
"rapppid-weights.tar_meta.xml"
],
"sha1": "b2f0f2bbec34aa9140fb9ac3fcb190588a496aa3"
}
],
"name": "rapppid-weights.tar",
"piece length": 524288,
"pieces": "<hexhex>"
}
I think we have to change the function that calculates the infohash:
pub fn calculate_info_hash_as_bytes(&self) -> [u8; 20] {
    let info_bencoded = ser::to_bytes(&self.info).expect("variable `info` was not able to be serialized.");
    let mut hasher = Sha1::new();
    hasher.update(info_bencoded);
    let sum_hex = hasher.finalize();
    let mut sum_bytes: [u8; 20] = Default::default();
    sum_bytes.copy_from_slice(sum_hex.as_slice());
    sum_bytes
}
We cannot simply bencode the TorrentInfo struct because it does not contain the custom fields:
pub struct TorrentInfo {
    pub name: String,
    #[serde(default)]
    pub pieces: Option<ByteBuf>,
    #[serde(rename = "piece length")]
    pub piece_length: i64,
    #[serde(default)]
    pub md5sum: Option<String>,
    #[serde(default)]
    pub length: Option<i64>,
    #[serde(default)]
    pub files: Option<Vec<TorrentFile>>,
    #[serde(default)]
    pub private: Option<u8>,
    #[serde(default)]
    pub path: Option<Vec<String>>,
    #[serde(default)]
    #[serde(rename = "root hash")]
    pub root_hash: Option<String>,
    #[serde(default)]
}
See: #242
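To illustrate why dropping the custom field changes the infohash, here is a minimal, std-only sketch. It is not the project's serializer: `bencode_bytes`, `bencode_dict` and `sample_info` are hypothetical helpers, and the dictionary is a cut-down, made-up version of the one above. It only shows that the bencoded bytes differ once `collections` is left out, so any hash over those bytes differs too.

```rust
use std::collections::BTreeMap;

/// Bencode a byte string as `<len>:<bytes>`.
fn bencode_bytes(s: &[u8]) -> Vec<u8> {
    let mut out = s.len().to_string().into_bytes();
    out.push(b':');
    out.extend_from_slice(s);
    out
}

/// Bencode a dictionary whose values are already bencoded.
/// `BTreeMap` iterates keys in sorted order, as bencode requires.
fn bencode_dict(entries: &BTreeMap<&str, Vec<u8>>) -> Vec<u8> {
    let mut out = vec![b'd'];
    for (key, value) in entries {
        out.extend(bencode_bytes(key.as_bytes()));
        out.extend_from_slice(value);
    }
    out.push(b'e');
    out
}

/// Build a tiny, made-up info dictionary.
/// `with_custom` controls whether the `collections` key is included.
fn sample_info(with_custom: bool) -> Vec<u8> {
    let mut info: BTreeMap<&str, Vec<u8>> = BTreeMap::new();
    info.insert("name", bencode_bytes(b"rapppid-weights.tar"));
    info.insert("piece length", b"i524288e".to_vec());
    if with_custom {
        // A bencoded list holding one string: l<string>e
        let mut list = vec![b'l'];
        list.extend(bencode_bytes(b"org.archive.rapppid-weights.tar"));
        list.push(b'e');
        info.insert("collections", list);
    }
    bencode_dict(&info)
}
```

`sample_info(true)` and `sample_info(false)` return different byte sequences. Feeding each to the same SHA-1 hasher would therefore give two different digests, which is presumably what happens when the info dictionary is re-serialized from TorrentInfo.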
@da2ce7 when we parse the torrent file, we do not extract the custom fields from the info dictionary. We have to use the complete info dictionary, not only the fields we are mapping into database table fields. Right now, I see these options:
1. We add the raw info dictionary to the AddTorrentRequest:
pub struct AddTorrentRequest {
    pub metadata: Metadata,
    pub torrent: Torrent,
    pub torrent_info_dict: Vec<u8>, // We can wrap it with a RawInfoDict type
}
We can no longer use the Torrent::torrent.info_hash() method, so we have to remove it and move the infohash calculation to the new type. We do not store the RawInfoDict in the database; we only use it in the service to calculate the infohash.
2. Same as 1, but we store the raw value of the info dictionary in the database. If we run into other problems or protocol extensions in the future, we still have the original value.
3. Same as 2, but we store the full torrent file in binary format instead of storing only the info dictionary. We can add a new table: torrust_raw_torrents.
I would implement option 3 because:
- It can be very helpful to detect other bugs like this.
- In the future, we can add features that rely on fields we are not storing right now.
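All three options depend on getting the original bytes of the info dictionary out of the uploaded torrent file. Here is a rough, std-only sketch of how that could look. `RawInfoDict` is the wrapper name suggested above, but its shape here is an assumption. `skip_value` and `extract_raw_info` are hypothetical helpers, not existing functions in the codebase. The scanner only walks bencode structure and copies the `info` value unchanged, so custom keys such as `collections` are preserved.

```rust
/// Hypothetical wrapper for the untouched bytes of the `info` value.
pub struct RawInfoDict(pub Vec<u8>);

/// Return the index just past the bencoded value that starts at `pos`,
/// or `None` if the input is malformed or truncated.
fn skip_value(buf: &[u8], pos: usize) -> Option<usize> {
    match *buf.get(pos)? {
        b'i' => {
            // Integer: i<digits>e
            let end = buf[pos..].iter().position(|&b| b == b'e')?;
            Some(pos + end + 1)
        }
        b'l' | b'd' => {
            // List or dictionary: skip nested values until the closing `e`.
            // Dictionary keys are byte strings, so one loop handles both.
            let mut cur = pos + 1;
            while *buf.get(cur)? != b'e' {
                cur = skip_value(buf, cur)?;
            }
            Some(cur + 1)
        }
        b'0'..=b'9' => {
            // Byte string: <len>:<bytes>
            let colon = pos + buf[pos..].iter().position(|&b| b == b':')?;
            let len: usize = std::str::from_utf8(&buf[pos..colon]).ok()?.parse().ok()?;
            let end = colon.checked_add(1)?.checked_add(len)?;
            if end <= buf.len() { Some(end) } else { None }
        }
        _ => None,
    }
}

/// Find the top-level `info` key and copy out its value bytes unchanged.
pub fn extract_raw_info(torrent: &[u8]) -> Option<RawInfoDict> {
    if *torrent.first()? != b'd' {
        return None;
    }
    let mut cur = 1;
    while *torrent.get(cur)? != b'e' {
        let key_end = skip_value(torrent, cur)?;
        let value_end = skip_value(torrent, key_end)?;
        // Compare the whole encoded key, length prefix included.
        if &torrent[cur..key_end] == &b"4:info"[..] {
            return Some(RawInfoDict(torrent[key_end..value_end].to_vec()));
        }
        cur = value_end;
    }
    None
}
```

The infohash would then be the SHA-1 of the bytes in `RawInfoDict`, for example by passing them to the same `Sha1` hasher that calculate_info_hash_as_bytes already uses, instead of the re-serialized TorrentInfo.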