Duplicate files
23:35:28 ~copypaste i'd use global hash based storage to save bandwidth
23:35:48 StephenLynx when storing files you would have to save them wherever you wish instead of mongo
23:35:52 ~copypaste ok, i see, lynxchan also doesn't use global hash storage like next does
23:35:57 ~copypaste i didn't check that
23:36:10 StephenLynx when serving you would have to check if the file is a media file and serve from your previously chosen storage method.
23:36:28 StephenLynx the final check for serving wouldn't be too hard.
23:36:31 ~copypaste sure
23:36:37 StephenLynx since there is a clear pattern for media files.
23:36:47 StephenLynx /board/media/file
23:37:00 ~copypaste so it would take some doing to get the media how i want
23:37:00 StephenLynx thumbs live side by side with the media they represent
23:37:03 ~copypaste probably as much as changing vichan
23:37:10 ~copypaste that's what it sounds like at least
23:37:16 StephenLynx I don't know how much work it is to fix vichan
23:37:22 StephenLynx but this change wouldn't be too hard.
23:37:29 StephenLynx since it wouldn't require changing how lynxchan works.
23:37:34 ~copypaste yeah but while you were explaining it you let me know a bigger issue
23:37:41 StephenLynx which is
23:37:46 ~copypaste it uses unix time storage and not hash based
23:37:51 ~copypaste two of the same file don't refer to the same file
23:37:51 StephenLynx hm?
23:38:01 ~copypaste it's wasteful of resources, both bandwidth and disk
23:38:04 StephenLynx what exactly do you mean?
23:38:06 ~copypaste mind infinity has the same problem
23:38:13 Jesus unix time storage was ineffective even back when /b/ had less than 10M posts
23:38:22 Jesus 6-8MGETs was lost that way
23:38:25 ~copypaste yep
23:38:29 StephenLynx using unix time to name files?
23:38:32 StephenLynx are you talking about that?
23:39:06 ~copypaste on next, when a file is uploaded, it's hashed and stored based on its hash. if the same file is uploaded again by another user, it isn't stored a second time
23:39:15 StephenLynx oh, that.
23:39:21 ~copypaste this saves us bandwidth, because it can come from a user's cache, and it saves us hard disk space, as people posting the same meme over and over don't fill the disk
23:39:40 Jesus most chan images are reposted memes anyway
23:39:48 StephenLynx it can be enabled per-board though.
23:39:57 DeepBlueSea [16:32:27] <~copypaste> i know it happens to all DBs
23:40:00 DeepBlueSea it doesn't happen to me!
23:40:05 ~copypaste hehehe
23:40:05 StephenLynx how often do people post the exact same file twice?
23:40:10 Jesus nvm he said that
23:40:14 ~copypaste more than you'd think, ESPECIALLY with webm threads
23:40:22 ~copypaste actually i wanted this issue specifically for webm threads
23:40:27 ~copypaste it would save us so much bandwidth and disk space
23:40:32 ~copypaste jpegs don't really matter as much
23:40:39 StephenLynx I see.
23:41:11 ~copypaste this could also help endchan too
23:41:17 ~copypaste you can do more with the same disk space
23:41:36 StephenLynx would preventing the same file from being uploaded globally cut it?
23:41:48 StephenLynx preventing the whole post altogether?
23:42:00 Jesus webm threads on /pol/ are usually the same as the last ones
23:42:00 ~copypaste i don't think so
23:42:02 ~copypaste yeah
23:42:05 StephenLynx I figured.
23:42:06 ~copypaste you should allow it in
23:42:09 ~copypaste but refer to the same file
23:42:12 ~copypaste it should be transparent to users
23:42:21 StephenLynx ok, I will put that on the 1.6 roadmap.
23:42:32 Jesus you could retain files after they're dead of old age if you wanted, so they're there even before it gets uploaded
23:42:34 StephenLynx that's too big to get into 1.5 at this point.
23:42:36 ~copypaste cool. this is one of the reasons i stay behind next so much
23:42:41 ~copypaste it's a killer feature
23:42:45 StephenLynx got it.
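
Note: the hash based storage ~copypaste describes (store each upload under its hash, let a repeat upload refer to the existing file, keep it transparent to users) could look roughly like the sketch below. This is only an illustration, not code from lynxchan, vichan, or next. The class name HashStore, the choice of SHA-256, and the in-memory reference count are all assumptions made for the example; the log does not say which hash or bookkeeping any of these engines use.

```python
import hashlib
import os
import tempfile


class HashStore:
    """Toy content-addressed media store (hypothetical, not any real engine's code)."""

    def __init__(self, root):
        self.root = root
        self.refs = {}  # digest -> number of posts that refer to the file
        os.makedirs(root, exist_ok=True)

    def add(self, data: bytes) -> str:
        # Name the file after its content hash (SHA-256 is an assumption).
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.refs:
            # First upload of this content: write it to disk once.
            with open(os.path.join(self.root, digest), "wb") as fh:
                fh.write(data)
        # A repeat upload is accepted but only adds a reference.
        self.refs[digest] = self.refs.get(digest, 0) + 1
        return digest

    def remove(self, digest: str) -> None:
        # Delete the bytes only when the last referring post is gone.
        self.refs[digest] -= 1
        if self.refs[digest] == 0:
            del self.refs[digest]
            os.remove(os.path.join(self.root, digest))

    def stored_files(self):
        return sorted(os.listdir(self.root))


store = HashStore(tempfile.mkdtemp())
first = store.add(b"same webm bytes")
second = store.add(b"same webm bytes")   # reposted by another user
other = store.add(b"a different file")
print(first == second, len(store.stored_files()))
```

Both posts get the same name back, so they can link to one URL and a browser cache can serve the repeat. Skipping the delete in remove() would give the variant Jesus mentions, where files outlive the posts that used them.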