Skip to content

Conversation

flauschzelle
Copy link
Collaborator

Mostly adapted from the code that does the same thing in the convert_tweet method.

Still missing the part that saves the online paths for later lookup of better quality version, because I don't know how the best version online path would look for an image from a DM, it might be very different from the ones in a tweet (assuming this because the structure of the archive's JSON is also different for DMs and tweets).

I also wrote hints for this as TODO comments in the code. And I'm neither sure if DMs can even contain videos anyway, nor if I understood the video URL parsing part correctly... please review and test this. For me, it runs without errors - but this still shouldn't be merged in the current unfinished state.

Help with the missing part would be much appreciated :)

@flauschzelle
Copy link
Collaborator Author

I figured out the URL format for the original size images (still no idea about the videos). But they can't be retrieved (at least from the browser) without the correct cookie (which probably means being logged in as the archive's owner). Until that problem is solved, I think the URL should not even be added to the list of possible best-quality media downloads.

So I think this could be merged now (that is, if you find no need for changes unrelated to the best-quality media url part), because I think the cookie problem would get its own issue and PR anyway if it's even possible at all...

@timhutton
Copy link
Owner

timhutton commented Nov 24, 2022

Is there duplicated code here, between parse_direct_messages() and convert_tweet()? I wonder if first we need to refactor convert_tweet() into several smaller functions such that parse_direct_messages() can reuse them.

@flauschzelle
Copy link
Collaborator Author

Is there duplicated code here, between parse_direct_messages() and convert_tweet()? I wonder if first we need to refactor convert_tweet() into several smaller functions such that parse_direct_messages() can reuse them.

Currently, there is a little bit of duplicated code, but the JSON format of Tweets and DMs is so different that I think it would take a lot of abstraction (and thus, making the code even less readable than it already is) to make any extracted method usable in both cases. So I'm not sure if that would even be worth it.

@timhutton
Copy link
Owner

This change doesn't embed images, sadly, since on L596 the body is inside ```...```

A rendering of the markdown looks like this:

image

We could consider remove the backquotes. It would break some DMs that have code in them, which appear like this:

"messageCreate" : {
            "recipientId" : "15840592",
            "reactions" : [ ],
            "urls" : [ ],
            "text" : "kernel void rd_compute(global float *a_in,global float *a_out)\n{\n    // parameters:\n    const float time ...

We could replace \n with <space><space>\n, which makes a newline in markdown. Code blocks won't align any more (because the rendering won't be in a fixed-width font any more but I think that's the best we can do.

@timhutton
Copy link
Owner

But let's get this change in and we can fix that issue later if it helps unblock things.

@timhutton
Copy link
Owner

It makes sense that images in DMs require the user's login to retrieve. We aren't going to fix that any time soon. I'd be tempted to remove the download URL and the comment about it.

@flauschzelle
Copy link
Collaborator Author

This change doesn't embed images, sadly, since on L596 the body is inside ...

A rendering of the markdown looks like this:

image

We could consider remove the backquotes. It would break some DMs that have code in them, which appear like this:

"messageCreate" : {
            "recipientId" : "15840592",
            "reactions" : [ ],
            "urls" : [ ],
            "text" : "kernel void rd_compute(global float *a_in,global float *a_out)\n{\n    // parameters:\n    const float time ...

We could replace \n with <space><space>\n, which makes a newline in markdown. Code blocks won't align any more (because the rendering won't be in a fixed-width font any more but I think that's the best we can do.

Agreed :)

I did something like this (the newlines in markdown thing) for Tweets in #123 and I think the look of the DMs text body output could be changed to the same format at the Tweets. (I was actually waiting for this here to be merged so I could apply that to DMs anyway, as I said in #123)

I think the fixed width is not more necessary for DMs than it would be for Tweets, and the Tweets md output looks fine without it :) ... on Twitter's own frontends, code snippets would also not be displayed with correct indents and a monospaced font, so I think people are used to that anyway :)

@flauschzelle flauschzelle merged commit 8a3737c into timhutton:main Nov 26, 2022
flauschzelle added a commit that referenced this pull request Nov 27, 2022
Update README according to #116, #118 (DMs), #126 and #136 (CLI instructions)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants