

The Vault Preservation Project


161 replies to this topic

#76
Rolo Kipp

  • Members
  • 2 791 messages
<smiling benignly...>

@ Lovelamb: WooHOO! *dances like a goat in spring* My Favorite new evil diva! Thank you!
I also seriously want to see my stories in action and try to balance all those shining white do-gooders with a proper pinch of... <smut>
I was going to say soot, bird. I'm talking here. <flirting>
In front of Pen? Heavens forfend! <her knives *are* still sharp, i'd think>
which brings me to...

@ Pen: Dearheart! How *are* you?! What you been up to? *makes sure the exit is clear*
Where's my feathered cloak? <what he said, giggles>

<...at the ladies>

Edited by Rolo Kipp, 03 October 2012 - 05:09.


#77
Rolo Kipp

  • Members
  • 2 791 messages
<thinking long and...>

At this point I'd like to state an opinion. <pontificating again, old man?>
Er, no. I just want it very clear this is only my opinion :-P <heh>
There is a fabulous wealth of modules on the Vault and I love the Vault and I hope people continue to make their modules available on the Vault. <but?>
But, I really think the Nexus and (to somewhat lesser degree) the ModDB do a better job (and getting better) of collecting modules. This opinion is colored, or rather *not* colored, from experience. I haven't tried to upload anything to either place, so I don't know that side of things.

What I'm getting around to (I *like* beating around bushes!) is that there's no place like the Vault for classic modules, but the newer options seem to me to be far more viable for new releases.

If we do the VPP right, my opinion could very well change, but I thought I'd mention it here, anyway :-P Bottom line is that if I was to release a mod... <you mean finish it first, right?>
...I'd post it to the Vault first, but I'd also post it on the Nexus.

Then I'd look at the D/L figures and see where people are grabbing it. =) Analyse, modify, test. Repeat.

<...hard>

#78
acomputerdood

  • Members
  • 219 messages

Rolo Kipp wrote...
@ meaglyn: But that is what I want! :-P The key value metadata, that is... preferably in CSV or Excel format. 

Would you be willing to share with ACD and incorporate that? He's sent me one updated version, why not another =)


it's probably not worth trying to combine the two systems.  i doubt meaglyn wrote hers in perl, so then either i'd be stuck trying to port her code or she would have to finish up her tool to grab everything.

i'd wager it won't be much harder for her to iterate through each project directory that's already been archived and build her metadata off of that.  probably slightly easier if i've already pulled down and broken out all the pieces needed.
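
For anyone following along, here is a minimal sketch of that kind of post-processing walk. It assumes the archive holds one subdirectory per project; that layout and the function name `collect_projects` are illustrative guesses, not a description of ACD's actual script or its output.

```python
import os


def collect_projects(archive_root):
    """Map each project directory name to the sorted list of files directly inside it.

    Assumes (hypothetically) one subdirectory per archived project.
    Loose files at the top level and nested subdirectories are ignored.
    """
    projects = {}
    for entry in sorted(os.listdir(archive_root)):
        path = os.path.join(archive_root, entry)
        if os.path.isdir(path):
            projects[entry] = sorted(
                name
                for name in os.listdir(path)
                if os.path.isfile(os.path.join(path, name))
            )
    return projects
```

The returned dictionary is only a starting point; a metadata builder would then open whichever saved page or description file each project directory contains.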

i'll PM her the email i sent to you explaining everything if she wants to pick up the post-processing.

Getting the metadata into an easily imported format would make things vastly easier. I'd then use that CSV file to generate the projects. Then all I need to do is link up the files/screenies and comments.


is there a standard format the metadata needs to be in to import into your database?  or will your database read in the metadata based on the format the projects are in?  i guess whoever develops the system first gets to determine the format.  :)

Actually, comments could be collected in a keyed file, also. Drupal gives each comment its own node and links the nodes to the project. So I'd just need a field for each comment with the unique identifier for the project... I think... :-P


yeah, completely lost me there.  hopefully somebody else knows what rolo's rambling about and can make it happen.

#79
meaglyn

  • Members
  • 808 messages

it's probably not worth trying to combine the two systems.  i doubt meaglyn wrote hers in perl, so then either i'd be stuck trying to port her code or she would have to finish up her tool to grab everything.


It's actually "his" :)  The picture just fit the story I'm working on. Needs a half-elf female and I was using Raptre Thanlis for all my testing so that seemed like the right picture at the time given the few default choices. Just haven't gotten around to finding and loading a custom picture...

Anyway, it's partly perl. But I ended up using jsoup and java to parse the HTML because it was good at handling
incomplete html. Many of the other HTML parsers I found wanted the whole page but there's so much noise there
I just used perl to strip down to only the main part. 
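
The same two-step idea (cut the page down to the main part, then hand the fragment to a parser that tolerates unclosed tags) can be sketched in Python, whose standard `html.parser` is also forgiving about incomplete markup. This is not Meaglyn's perl/jsoup code; the start and end markers are hypothetical stand-ins for whatever actually delimits the project block on a saved page.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the non-empty text runs found in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def main_part(page, start_marker, end_marker):
    """Return the text between two markers (both markers are assumptions)."""
    start = page.find(start_marker)
    if start == -1:
        return ""
    start += len(start_marker)
    end = page.find(end_marker, start)
    return page[start:] if end == -1 else page[start:end]


def fragment_text(fragment):
    """Parse a possibly incomplete HTML fragment and return its text runs."""
    parser = TextCollector()
    parser.feed(fragment)
    parser.close()
    return parser.chunks
```

Usage would be something like `fragment_text(main_part(page, "<!-- begin project -->", "<!-- end project -->"))`, with the real markers substituted in.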

It should not be too difficult to rework what I have to post-process the downloaded stuff produced by your script.
I'll watch for your pm. It'll be cleaner and easier to maintain to do it post anyway.

Rolo, we should talk about what format you want for the final output. For now I'll continue working to an intermediate
format which we can either modify to suit or just translate to a "final" format when we know what that is.

Cheers,

Meaglyn

Edited by meaglyn, 03 October 2012 - 07:31.


#80
acomputerdood

  • Members
  • 219 messages

meaglyn wrote...

It's actually "his" :)


nope, according to Internet Rule #35771, somebody with a female avatar shall be known and treated as a girl.

sorry :(

#81
Rolo Kipp

  • Members
  • 2 791 messages
<looking optimistically...>

@ Meaglyn: Yes, we should :-) I'll have a lot more time for this on Sunday, though. I'm sneaking NwN bits in between paying work and can't dig into it right now.

I'll most likely be using the Migrate module for drupal and will look at that for the format needed:

The migrate module provides a flexible framework for migrating content into Drupal from other sources (e.g., when converting a web site from another CMS to Drupal). Out-of-the-box, support for creating core Drupal objects such as nodes, users, files, terms, and comments are included - it can easily be extended for migrating other kinds of content.

Primarily, I'm probably looking at a Comma Separated Value file with the first row being the column (field) names. The fields needed vary with project type.
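
As a rough illustration of that layout, the sketch below writes a header row followed by one row per project and leaves a cell empty when a project type lacks a field. The field names (`title`, `type`, `min_level`) are made up for the example; the real column set would come from whatever the import side expects.

```python
import csv
import io


def projects_to_csv(projects):
    """Write dict records as CSV text with the field names in the first row.

    The columns are the union of all keys in first-seen order, so project
    types with different fields can share one file; missing values are
    written as empty cells.
    """
    fieldnames = []
    for project in projects:
        for key in project:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(projects)
    return buffer.getvalue()
```

One file per project type would avoid the empty cells; a single shared file is just the simpler thing to sketch.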

@ ACD: I just want the comment metadata to include a field with the name of whatever project it belongs to ;-P But I'm picking nits :-P

<...harried>

#82
Bannor Bloodfist

  • Members
  • 929 messages
Note: I hate this site.  Dang thing has, 3 times in a row, completely lost a post I have made on this.

Rolo Kipp wrote...

<thinking long and...>

At this point I'd like to state an opinion. <pontificating again, old man?>
Er, no. I just want it very clear this is only my opinion :-P <heh>
There is a fabulous wealth of modules on the Vault and I love the Vault and I hope people continue to make their modules available on the Vault. <but?>
But, I really think the Nexus and (to somewhat lesser degree) the ModDB do a better job (and getting better) of collecting modules. This opinion is colored, or rather *not* colored, from experience. I haven't tried to upload anything to either place, so I don't know that side of things.


Great idea, but absolutely not workable.

Nexus absolutely forbids posting/re-posting of some other author's works.  Check their EULA for that, but I think it is fairly prominent in other locations as well.

As a team working to provide a "safe haven/backup" of the vault, that is one job.  Re-publishing to another site, well, that opens that nasty can of worms regarding copyrights and we all know that the worms involved have a tendency to exponentially multiply once various opinions get involved.  There are copyrights, fair usage rights, etc, none of which are lost/broken by posting something onto a site, regardless of the EULA of that particular site.  Yet to be tested in court, but easily found in written copyright laws.

Sure wish we could just get permission from all the authors with a single mass email attempt or something, but we all know that is not enough of an attempt to contact folks.  And we all know that the email addresses on 90% of the existing vault content are no longer valid, yet many of those various authors are still around in some fashion, sometimes with new nicks, sometimes with old nicks, sometimes just watching, and sometimes still contributing in some fashion.

Regardless of copyright issues, we would have to do our due diligence to attempt to contact the authors of the various projects, and then wait 5 years (is it 5 or 10?) before any given project could be considered abandonware and thus free from copyright?  And according to wikipedia, using a partial quote of the first page of data regarding the abandonware issue: "In most cases, software classed as abandonware is not in the public domain, as it has never had its original copyright revoked and some company or individual still owns exclusive rights. Therefore, sharing of such software is usually considered copyright infringement, though in practice copyright holders rarely enforce their abandonware copyrights."

Surely, this is not something that this project was intended to do.

<snip>
If we do the VPP right, my opinion could very well change, but I thought I'd mention it here, anyway :-P Bottom line is that if I was to release a mod... <you mean finish it first, right?>
...I'd post it to the Vault first, but I'd also post it on the Nexus.
Then I'd look at the D/L figures and see where people are grabbing it. =) Analyse, modify, test. Repeat.
<...hard>


A great idea to cross-post any NEW works of your own to both locations.  

Edited by Bannor Bloodfist, 03 October 2012 - 09:19.


#83
Rolo Kipp

  • Members
  • 2 791 messages
<grumping...>

I seem to be on a roll for not saying what I mean. :-P

Rolo thought he said...
there's no place like the Vault for classic modules, but the newer options seem to me to be far more viable for new releases.


I meant the last bit - that people should multiple-post new stuff and that the old stuff should be let sit on the Vault (and the VPP).

<...just 'cause he can>

#84
Bannor Bloodfist

  • Members
  • 929 messages
Yep, and that aspect I directly agreed with above... last line of text in the post.

#85
Tarot Redhand

  • Members
  • 2 679 messages
In my wanderings around the net I have stumbled on something that might be of use to this project. It is called Spider.NET.1.4. To quote the help file :-

Quote

Spider is a .NET application which crawls websites and saves content and links to a Microsoft SQL Server Database.

End Quote

I would love to say it is a wonderful program but <sheepish grin> I have run into a couple of little problems. The first of which is I can't remember which precise website I found it on (Codeplex, SourceForge or Planet Source Code). The other is it requires a database called spider to be created before it can be used and as I haven't used sql for <mumbles under breath> years, I don't know how to do that.

In the hopes that someone else can get it working and then evaluate its usefulness or otherwise, I have uploaded it to my dosbox/public folder. The complete package is in a plain ol' zip file to be found here.

TR

Edited by Tarot Redhand, 03 October 2012 - 11:26.


#86
Rolo Kipp

  • Members
  • 2 791 messages
<pulls out his...>

Hmmm... on codeplex?

Edit: At first scan, it does generically what ACD has put together to do specifically. That is, I don't think there'd be any advantage to spider.net over my experience with wget. Both simply grab too much stuff. What ACD & Meaglyn are doing is processing the crawl and only pulling in the project stuff.
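
To make the "grabs too much" point concrete: a generic crawler returns every link it finds, and the project-specific step amounts to keeping only the URLs that belong to a project. A minimal sketch of that filter follows; the `/projects/` path prefix and the example host are invented for illustration and are not the Vault's real URL scheme.

```python
from urllib.parse import urlparse


def keep_project_urls(urls, path_prefix):
    """Keep only the crawled URLs whose path starts with the given prefix.

    `path_prefix` is a hypothetical stand-in for however project pages
    are actually laid out on the crawled site.
    """
    kept = []
    for url in urls:
        if urlparse(url).path.startswith(path_prefix):
            kept.append(url)
    return kept
```

For example, `keep_project_urls(crawled, "/projects/")` would drop banner images, forum pages and the like before anything is downloaded.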

But good looking out, mate! :-)

<...spider net and tries to catch one>

Edited by Rolo Kipp, 03 October 2012 - 11:45.


#87
Tarot Redhand

  • Members
  • 2 679 messages
That was my main thought but I couldn't be quite sure. I do know there is a readme in there with an email if the author needs to be contacted.

Yes, just followed your link, Rolo, and that is the one.

TR

Edited by Tarot Redhand, 03 October 2012 - 11:32.


#88
kamal_

  • Members
  • 5 253 messages

Lovelamb wrote...

Sir, did you say the Vault is now read-only? I've devoted over a year of my recent life to working on an evil module that I doubt the Nexus, with their strict rules, would accept... (Should I kill myself for being late? :()

I would like to help with backing up the Vault, though I might need an explanation as to how to upload the content to your site. I can save all the web pages and related files for now. You can sign me up for the first 10 pages (or 250 modules) on the module list. I'm not sure how the modules are ordered, hope everyone sees the same list. I have the disk space, but my upload speed isn't very high.

off topic, but the Nexus doesn't seem to have a problem with evil content. My evil campaign for nwn2 is up there. You can sell children into slavery and kill completely innocent people, among other not nice things.

#89
meaglyn

  • Members
  • 808 messages

Rolo Kipp wrote...

<looking optimistically...>

@ Meaglyn: Yes, we should :-) I'll have a lot more time for this on Sunday, though. I'm sneaking NwN bits in between paying work and can't dig into it right now.

I'll most likely be using the Migrate module for drupal and will look at that for the format needed:

The migrate module provides a flexible framework for migrating content into Drupal from other sources (e.g., when converting a web site from another CMS to Drupal). Out-of-the-box, support for creating core Drupal objects such as nodes, users, files, terms, and comments are included - it can easily be extended for migrating other kinds of content.


I'll take a look at that.

Primarily, I'm probably looking at a Comma Separated Value file with the first row being the column (field) names. The fields needed vary with project type


CSV will be a little tricky what with the free text description and comment fields.
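
The tricky part is that descriptions and comments can contain commas, quotes and line breaks. A CSV library that quotes fields can carry those through, as this small round-trip sketch suggests; it only demonstrates Python's own `csv` module and says nothing about how the Drupal importer treats multi-line fields.

```python
import csv
import io


def roundtrip(rows):
    """Write rows with every field quoted, then read them back.

    Quoting lets commas, double quotes and embedded newlines survive
    inside free-text fields.
    """
    buffer = io.StringIO(newline="")
    csv.writer(buffer, quoting=csv.QUOTE_ALL).writerows(rows)
    buffer.seek(0)
    return list(csv.reader(buffer))
```

If the rows come back unchanged, the free text survived; whether the consuming tool reads quoted multi-line fields the same way would still need checking.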

A key/value dictionary allows the fields to be sort of self-describing, and in theory we can use the same code for all the project types.  I'll see what the migrator can use...

Another good thing about doing it as a post-processor is we've got all the original data and can simply
re-process things as we refine what it needs to look like.


Cheers,
Meaglyn

#90
Rolo Kipp

  • Members
  • 2 791 messages
<reading a bit...>

Well, I'm not married to the CSV (that was a leftover from a migration I did about 18 months ago, btw).

Here's another Migrate quote :-)

Features:

  • An object-oriented architecture, allowing default behavior to be extended and/or overridden.
  • Built-in support for PDO (DBTNG), XML, CSV, JSON, and native MSSQL and Oracle API sources; extendable for other sources.
  • Built-in support for node, user, taxonomy term, comment, and file destinations; extendable for other destinations entities and fields.
  • Map tables maintain relationships between source data and the resulting Drupal objects.
  • Import operations can be rolled back, allowing simple trial-and-error development of migration processes.
  • Tools for managing dependencies between migrated content.
  • Automated management of memory usage and framework for performance logging.
  • A web UI supporting collaboration between stakeholders and implementors.

Perhaps xml or json would be less tricky?
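
For comparison, here is what key/value records look like as JSON. Free text needs no special handling because the encoder escapes it, and each record can carry a different set of keys. The record fields shown in the usage note are invented for the example.

```python
import json


def to_json_records(projects):
    """Serialize a list of key/value project records as readable JSON text."""
    return json.dumps(projects, indent=2, sort_keys=True)


def from_json_records(text):
    """Parse the JSON text back into a list of dictionaries."""
    return json.loads(text)
```

A record such as `{"title": "My Hak", "description": "Line one,\nline \"two\"", "comments": ["Great work"]}` round-trips unchanged through these two functions.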

<...on the side>

Edited by Rolo Kipp, 04 October 2012 - 05:45.


#91
Rolo Kipp

  • Members
  • 2 791 messages
 <flashing back...>

Do you remember the start of the music video for one of the Eurythmics' songs? A very tall man walks up and says "Not wanting to waste your valuable time..."

Heh.

I'm at that point here, thanks to Acomputerdood :-P The perl script he wrote is now quietly churning away in the background as I write. (It's taken all week, but so far I've grabbed up to #7800 (out of 8170-something) hakpaks.)

I'd like to say that I am enormously gratified at the willing effort so many people have put in to preserving this content, and, really, none of that effort is wasted. Redundancy is crucial in such a fluid community and just *trying* to help really shows what a great heart you people have.

I'd also like to specifically point out the *huge* effort Michael darkangel has made - all while continuing to update and bug-fix the TileSet Creator and Nw(g)Max!

The next step (while continuing to harvest projects) is to tweak Meaglyn's metadata formatter to prep for migrating the projects into drupal.

After the projects are migrated, then I'll be really diving into all those suggestions by Bannor, et al ;-) That's when we'll work out the look and feel of the site.  

I promise, no disco balls.

(Edit: reference

This video, entitled the "Sweet Dreams Video Album," was originally broadcast on MTV and partly filmed at a night club called "Heaven" in London in May 1983. It features Annie and Dave with a relatively new band line-up... 

...Reportedly the video was released against Dave and Annie's wishes as they weren't pleased with the performance - Annie had been ill and the new band did not have enough rehearsal time.    

The video opens with a surreal scene involving a music agent (Norman Bacon) being confronted by a tall man (Stephen Calcutt) who looks like he is straight out of the Black Lodge in Twin Peaks. The video then segues into the "concert" at Heaven, featuring Annie and Dave in a very eighties environment - full of pulsating disco balls, laser beams, lots and lots of fat analogue synth sounds, and big bouffy frizzy hairstyles. Dave looks like he's in another world as usual and Annie looks striking in her two piece suit complete with garters.

because someone asked ;-)

<...to the eighties>

Edited by Rolo Kipp, 20 October 2012 - 04:57.


#92
Bannor Bloodfist

  • Members
  • 929 messages
Gawd, I knew there was a decade that I hated... you just had to remind me about Disco :sick: didntcha?

/me smacks RoloKip around with a wet trout

Edit:  P.S.  Shamefully admitting I forgot to type this bit of the message.:crying:

THANK YOU to EVERYONE that has contributed to this project.  Even if no-one in the future ever recognizes your efforts, please know that they ARE appreciated, and will be even by the folks that don't say so, simply because things will be there, and easily found etc... no one ever wishes to thank folks for things that work correctly, we all tend to just "expect it" to work. 

So, again, Thank You, VERY MUCH for all your efforts.:wizard:

Edited by Bannor Bloodfist, 20 October 2012 - 06:51.


#93
Rolo Kipp

  • Members
  • 2 791 messages
<wipes his face...>

Hey!
Hmmm...
*Just* what Cestus needs for the Fishmonger Cart. Thanks for the fish!

<...with his long ugly scarf>

#94
Lightfoot8

  • Members
  • 2 535 messages
Just to throw a possible monkey wrench in the mix.

If you have control of the site that you are planning on uploading all of this to, is it not possible to just send the site the information about what needs to be uploaded from the Vault and have it do the uploading? In effect, cutting out the middle man. I am no web guru, but it seems like something like that could be done.

Basically, in the form that gives the option of "uploading a file", you give an option of uploading the file from the web. It would cut the time needed by at least half, most likely even more than that, since most people's upload speeds are a lot lower than their download speeds.
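
A minimal sketch of that "upload from the web" idea: the server is handed a URL and pulls the file itself, so nothing passes through a volunteer's slow upload link. The function name and destination path are illustrative, and a real form handler would also need to validate the URL and cap the file size.

```python
import shutil
import urllib.request


def fetch_to_disk(source_url, destination_path):
    """Download the resource at source_url straight onto the server's disk.

    The response is streamed in chunks rather than loaded into memory.
    """
    with urllib.request.urlopen(source_url) as response:
        with open(destination_path, "wb") as out:
            shutil.copyfileobj(response, out)
    return destination_path
```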

#95
Bannor Bloodfist

  • Members
  • 929 messages
Please re-read my post located 3 above this one.

#96
Rolo Kipp

  • Members
  • 2 791 messages
<directing...>

ACD's perl script is doing that now. Running it on the site, it's grabbing the files, screenshots and metadata for each project.

Then Meaglyn's script will extract and massage the metadata into xml format for importing into the drupal mysql db, which will re-link all those files & screenshots with the project.
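
As a rough picture of that middle step, the sketch below turns key/value metadata into XML with Python's standard library. The element names (`projects`, `project`, and one child per key) are assumptions made for the example, not the schema Meaglyn's script or the Drupal import actually uses, and the keys must be valid XML tag names.

```python
import xml.etree.ElementTree as ET


def projects_to_xml(projects):
    """Render a list of key/value records as one XML document string.

    Each record becomes a <project> element with one child element per
    key; special characters in values are escaped by ElementTree.
    """
    root = ET.Element("projects")
    for project in projects:
        node = ET.SubElement(root, "project")
        for key, value in project.items():
            ET.SubElement(node, key).text = str(value)
    return ET.tostring(root, encoding="unicode")
```

Keeping file and screenshot names as ordinary fields in each record is one way the later import could re-link them to the project.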

Host pipeline is *much* faster than wireless at Starbucks! =P

<...traffic>

#97
Bard Simpson

  • Members
  • 162 messages
This. Is. Awesome. You guys rock! As Bannor said, you probably won't get all the thanks you deserve, but I want to express my gratitude right now. This is a great idea, a tremendous job, and of immeasurable value. Thank you very much!

#98
KlatchainCoffee

  • Members
  • 258 messages
What Bard Simpson said. :)

#99
Zwerkules

  • Members
  • 1 321 messages
What the coffee that makes you knurd said. ;)

#100
Rolo Kipp

  • Members
  • 2 791 messages
<feeling all warm and fuzzy...>

Feedback is *always* appreciated.

Unfortunately, I've hit a huge snag. "Unlimited storage" is limited by "Clients may not use GreenGeeks servers for file storage unrelated to the client's web site, storage Space is for active web site file pages only."

I.e. I have 48 hours to remove all those files I've collected :-(
So, either I find someone willing to host the files, find a new host that will host the files or point back to the IGN server (which defeats the purpose of mirroring the Vault).

Don't really know what to do, other than flush what I've done.

And I think this means I'll be moving on from GG, as I don't much appreciate how they handled my <blunder?> oversight.

And it's raining. *sigh*

Gonna go write sumpin. <wanna snail, boss?>
Not right now, Bother.

<...despite this cold cruel world>