Do you use anything to archive content for yourself or others? (research, videos, articles, and anything that could be lost to time or censorship)

Otter@lemmy.ca · 14 days ago (edited)

yasser_kaddoura@lemmy.world · 13 days ago (edited)

I have a script that archives to:

I used to solely depend on archive.org, but after the recent attacks, I expanded my options.

Script: https://gist.github.com/YasserKa/9a02bc50e75e7239f6f0c8f04fe4cfb1

EDIT: Added script. Note that the script doesn’t include archiving to archivebox, since its API isn’t available in the stable version yet. You can add a function depending on your setup. Personally, I rely on Caddy and Docker, so I use the caddy-exec module [1] to execute commands, with this in my Caddyfile:

route /add {
	@params query url=*
	exec docker exec --user=archivebox archivebox archivebox add {http.request.uri.query.url} {
		timeout 0
	}
}

[1] https://github.com/abiosoft/caddy-exec

opulentocean@lemm.ee · 13 days ago

Would you be willing to share it?

yasser_kaddoura@lemmy.world · 13 days ago

Sure.

WhyJiffie@sh.itjust.works · 13 days ago

isn’t this prone to a

 || rm -rf /

or something similar at the end of the URL?

if you can docker exec, you have a lot of privileges already, so make sure this is not a danger

yasser_kaddoura@lemmy.world · 12 days ago (edited)

Thank you for the warning. You are correct. It’s prone to command injection. I will validate the URL before executing it. This should suffice until archivebox’s REST API is available in stable.
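
For anyone wanting to do the same, here is a rough sketch of what "validate the URL before executing it" could look like. It is only an illustration under assumptions, not what the gist does: the function names `is_safe_url` and `build_archivebox_command` are made up, and it assumes a small Python wrapper sits between the web request and `docker exec`. The docker arguments are copied from the Caddyfile above.

```python
from typing import List
from urllib.parse import urlsplit

# Assumption: only web URLs should ever be archived.
ALLOWED_SCHEMES = {"http", "https"}
# Shell metacharacters that are not valid unencoded in a URL anyway.
FORBIDDEN_CHARS = set('|`<>"\\{}^')


def is_safe_url(url: str) -> bool:
    """Return True only for plain http(s) URLs without suspicious characters."""
    if not url or len(url) > 2048:
        return False
    # A leading "-" could be read as a command-line option by the target program.
    if url.startswith("-"):
        return False
    for ch in url:
        # Reject whitespace, control characters, and the metacharacters above.
        if ch.isspace() or ord(ch) < 0x20 or ord(ch) == 0x7F or ch in FORBIDDEN_CHARS:
            return False
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    return parts.scheme in ALLOWED_SCHEMES and bool(parts.hostname)


def build_archivebox_command(url: str) -> List[str]:
    """Build the argument list for docker exec, refusing unsafe URLs."""
    if not is_safe_url(url):
        raise ValueError(f"refusing to archive suspicious URL: {url!r}")
    # The URL stays a single list element, so no shell ever parses it.
    return ["docker", "exec", "--user=archivebox",
            "archivebox", "archivebox", "add", url]
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` without `shell=True`. Characters like `&` and `;` are left alone because they are legal in query strings; they are harmless as long as the URL is passed as one argument rather than interpolated into a shell command line.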