Drupal 5 to ocP

#87309 (In Topic #17804)

Well-settled

I'm looking for input on "fix or port" for a live site

I have a problem. I have a damaged web site running (OK, limping) on Drupal 5. At this point action is needed. I wasn't supposed to have gotten into this mess but I'm here.

The web site in question is NicaLiving, a site which has a few thousand users (a few hundred active), about 20,000 articles and 100,000 comments. In addition, it has a few thousand photos. Most of the site problems are a result of a bad shared hosting environment resulting in screwed-up tables. There are some non-fatal errors and an issue with photo galleries. Any attempt to do a normal Drupal upgrade to version 6 results in a fatal error. As Drupal 7 does not offer photo galleries that can do what is currently done, the Drupal path is a dead end.

What I am thinking (and I am not the ocP expert here which is why I am looking for ideas/input) is that I can easily import the users and user blogs. (By easily I mean with some scaffolding but that's what awk or Python is for.) There are what Drupal calls books which are hierarchical pages which should easily import into CEDI, again with some scaffolding. I expect that the photo gallery import should be easy as well.

The big issue I see is importing the forums. There are clearly lots of things that need to be pruned from them but we are talking thousands of posts or comments. It might make sense to turn the current forums into something read-only (possibly another CEDI area) and start over seeding the forums with some important messages.

I know all this could be done and I am confident that, ultimately, ocP is the right fit for NicaLiving. The big issue is the transition. I would only attempt this if the site, for some level of functionality, is only missing for a day or less. After previous disasters, I cannot afford to have no NicaLiving for days or weeks.

What I am thinking is that a lot of stuff can be pre-loaded so there is a site to move to. For example, photo galleries and books have few changes so they could be put in place ahead of the official move. But, the bulk of the content is really in the forums.

Thus, I am looking for ideas here. If you have done something like this, speak up. Or, if you think this is totally wacko, speak up as well. My alternative is to stick my head into old Drupal code and make things work. Depressing but an alternative.

Note that what I had intended on doing was implementing a simpler web site in an open discussion with the end result being some useful documentation for ocP. Unfortunately, that project needs to get delayed because of this mess. This also may be an interesting learning experience resulting in some useful documentation, but not what I really had intended.

#87310

What I'd suggest as a next step is to come up with a list of each content type that needs to be transferred, then create a topic for that content type, linking back to this one. An example of that content (a link to see it) would be great, and even better would be some kind of data dump (which I appreciate may or may not be possible based on data privacy).

We can then break it down and address each separately.

As discussed over e-mail, our approach is to have a neutral way for handling each kind of content, and I'll help out once we have your data out into that kind of neutral form.

For forums, as there is no real neutral format, phpBB or SMF would be a good one. We can import these well, so if you can convert the Drupal forums (which personally I know nothing about) into phpBB/SMF it would be a big start.

I'd imagine you'd want to start with forums anyway, as presumably this would be the vector for transferring accounts, which presumably quite a lot hangs on.


Become a fan of ocPortal on Facebook or add me as a friend. Add me on Twitter.
Was I helpful?
  • If not, please let us know how we can do better (please try and propose any bigger ideas in such a way that they are fundable and scalable).
  • If so, please let others know about ocPortal whenever you see the opportunity.
  • If my reply is too Vulcan or expressed too much in business-strategy terms, and not particularly personal, I apologise. As a company & project maintainer, time is very limited to me, so usually when I write a reply I try and make it generic advice to all readers. I'm also naturally a joined-up thinker, so I always express my thoughts in combined business and technical terms. I recognise not everyone likes that, don't let my Vulcan-thinking stop you enjoying ocPortal on fun personal projects.
  • If my response can inspire a community tutorial, that's a great way of giving back to the project as a user.
#87322

Community saint

felipe said

The big issue is the transition. I would only attempt this if the site, for some level of functionality, is only missing for a day or less. After previous disasters, I cannot afford to have no NicaLiving for days or weeks.

You can mitigate that by doing dry-runs, importing into fresh test installs of ocPortal. During those runs you will also get a feel for how long it will take so you can get a good idea of how much down-time will be required.

#87327

Well-settled

Thank you both for your comments. I am still waiting for someone to tell me I am crazy, but I can wait. :-) Let me add some more thoughts. When I get serious, I agree that I need to deal with each type of data separately, and it does make sense to describe how to deal with each type in a separate post. Hopefully what I learn can be useful to others.

Chris, you see Forums as the starting point. I see them as the last thing I would deal with. Let me explain why.

It appears there are a bit over 5000 forum posts. While I need to do more homework to count them, I would expect the majority of the comments are in the forums as well. Let's pick 70,000 as a working number for now. (Counting them requires a bit more work as forums are integrated with the rest of Drupal so it requires two more table lookups to determine if a comment is in a forum vs. a blog or other site content.)
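
A minimal sketch of that kind of count, assuming a simplified layout in which each comment row points at a node and each node row carries a content type. The table and column names here (`node`, `comments`, `nid`, `type`) are illustrative stand-ins and should be checked against the real Drupal 5 schema, which may need a further lookup as noted above. The in-memory SQLite database only exists so the query can be tried without touching the live site.

```python
import sqlite3

def count_comments_by_node_type(conn):
    """Return {node_type: comment_count} by joining comments to their parent node."""
    rows = conn.execute(
        "SELECT n.type, COUNT(*) FROM comments c "
        "JOIN node n ON n.nid = c.nid "
        "GROUP BY n.type"
    ).fetchall()
    return dict(rows)

# Tiny in-memory stand-in for the real database (hypothetical, simplified schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (nid INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE comments (cid INTEGER PRIMARY KEY, nid INTEGER);
    INSERT INTO node VALUES (1, 'forum'), (2, 'blog'), (3, 'forum');
    INSERT INTO comments VALUES (10, 1), (11, 1), (12, 2), (13, 3);
""")

counts = count_comments_by_node_type(conn)
print(counts)
```

Run against a copy of the production tables, the same query would replace the 70,000 working number with a real figure.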

Again, with a bit of guessing, this makes Forum threads appear to be about 1/3 of all the non-comment content. It appears that there are about the same number of gallery entries so the remaining 1/3 would be blog posts, articles, book pages and such. Thus, a significant piece of the site content.

But, here is why I put forums at a lower priority than other content. The primary purpose of the site is to offer a place for people to find answers about moving to/living in Nicaragua. The site has existed for over eight years. The majority of what you find in the forums is not particularly useful. That includes a lot of outdated information, a lot of repeated information and, at least in forum comments, off-topic bickering.

"On the list" to improve the site is to cull out useful information and move it to the more authoritative parts of the site. In the Drupal implementation, this is put in what Drupal calls books which are hierarchical structures of pages (with comments). A small group of people has been responsible for these book pages so while there can be outdated information, there is a lot less noise.

While this is hard to determine, my SWAG (scientific wild-assed guess) is that from the 5000 forum threads and their associated comments you will get:
  • A few hundred (let's say 500) news articles. (These really are not forum discussions but that was the best way in Drupal to deal with them.) There are associated comments. All this should be moved to a categorized news area rather than being part of discussion forums. Assuming they are posted in the right place (there are a few forum topics for these which would make good categories) this should be an easy transition.
  • Maybe 1000 forum threads which discuss things which belong in a more authoritative and easier to search place on the site. From these 1000 threads (which probably means posts plus 10,000 to 20,000 comments), I would expect to create 100 authoritative pages.
  • Most of the remaining forums can be forgotten. That is, they never were particularly useful or, today, are meaningless.
Thus, while the heart of many sites is the forums, I don't feel that is the case with NicaLiving. This is not intended to imply that forums will not be important in the future, only that, partly due to limitations of the original Drupal platform when the site was initially created, housecleaning time has arrived for what is out there now.

I am leaning toward the following transition approach.
  • Create an ocP-based site with features essentially equivalent to the current site.
  • Build tools to port users (first), and blogs to the new site.
  • Build a tool to create the photo galleries on ocP, preserving their current gallery structure.
  • Using CEDI, create a structure similar to what is held in Books on the current site. While this process might be automated, it may make more sense to manually update the content in the process. I need to evaluate the magnitude of the work involved.
  • Extract news posts from the forums and create categorized news on the new site.
  • Create a forum structure that pretty much reflects the current forum structure. (There are some obvious fixes plus there is the elimination of the news sections.)
  • Move the user base to the new ocP-based system leaving the Forum section of the old site available in read-only mode for an indeterminate amount of time. (Watching the traffic should tell us what is missing from the new site and when the old site is no longer useful.)
To me, this approach makes sense in terms of current site usage patterns. It also should make the transition possible with very little downtime as much of the new site can be built while the current site continues to operate.

The biggest interruption I see is the change in URLs. There are some (I would guess more than 100 but less than 500) URLs embedded in the content. A redirect could be used for anything important but, for the most part, redirecting 404s to a search page should be sufficient. About 70% of the site traffic comes from search engines so this issue quickly should become insignificant.
#87328

users –> CSV –> ocPortal
blogs –> RSS –> ocPortal
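
For the blog leg of that pipeline, a minimal sketch of turning already-extracted blog posts into an RSS 2.0 document using only the Python standard library. The input keys (`title`, `url`, `body`, `created` as a Unix timestamp) are assumptions about what an extraction script would produce, and which RSS fields the ocPortal importer actually reads has not been verified here.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime

def blog_posts_to_rss(site_title, site_url, posts):
    """Build an RSS 2.0 document (as a string) from a list of post dicts."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = site_title
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = f"Blog export from {site_title}"
    for post in posts:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = post["title"]
        ET.SubElement(item, "link").text = post["url"]
        ET.SubElement(item, "description").text = post["body"]
        # RSS expects RFC 822 style dates; the source value is a Unix timestamp.
        created = datetime.fromtimestamp(post["created"], tz=timezone.utc)
        ET.SubElement(item, "pubDate").text = format_datetime(created)
    return ET.tostring(rss, encoding="unicode")

# Hypothetical sample post, not real site data.
feed = blog_posts_to_rss(
    "NicaLiving",
    "https://example.com/",
    [{"title": "Hello", "url": "https://example.com/blog/1",
      "body": "First post", "created": 0}],
)
print(feed)
```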

About 70% of the site traffic comes from search engines so this issue quickly should become insignificant.

Well I'd make kind of the opposite conclusion to that fact – you want the search engine to maintain its ranking for those URLs that currently are ranked, by setting up redirects.

You may want to at least look at the ones you get most Google hits from and remap those via a .htaccess.
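
One way to produce that remapping is to generate the `.htaccess` lines from a table of old paths and new URLs. The sketch below emits Apache `Redirect 301` directives (from mod_alias); the paths and target URLs are made-up examples, and paths containing spaces would need quoting that this sketch does not handle.

```python
def build_htaccess_redirects(url_map):
    """Return .htaccess text with one permanent redirect per old path."""
    lines = []
    for old_path, new_url in sorted(url_map.items()):
        # Redirect expects a URL path beginning with a slash.
        if not old_path.startswith("/"):
            old_path = "/" + old_path
        lines.append(f"Redirect 301 {old_path} {new_url}")
    return "\n".join(lines) + "\n"

# Hypothetical mapping of the best-ranked old URLs to their new homes.
top_urls = {
    "node/42": "https://example.com/news/42",
    "/forum": "https://example.com/forums",
}
htaccess_text = build_htaccess_redirects(top_urls)
print(htaccess_text)
```

Starting with only the pages that receive the most search traffic keeps the file short, and anything not listed can still fall through to the 404-to-search redirect.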


#87361

Well-settled

This is a quick update. During my research to figure out how to get data out of the existing site, I stumbled upon something that might be the cause of the failure of the Drupal 5 to 6 update. That doesn't mean the conversion to ocP is not going to happen but it may not need to happen quickly.

For anyone with an existing Drupal 5 site considering moving to ocP, there are huge advantages of upgrading to Drupal 6 first. There are many tools that exist in Drupal 6 that either don't exist or exist in very primitive forms in Drupal 5 that will make the process a lot easier. They include:
  • The migrate module. While designed to import into a Drupal site from a non-Drupal site, recent development has been on Drupal to Drupal ports. The module uses PHP classes to define the from and to data formats. Recent work has added the necessary classes to export Drupal 6 data. Thus, you don't need to understand what Drupal 6 does inside to get the data out. No such work has been done for Drupal 5.
  • Drush is a CLI program with a host of utility features. While it has been around since Drupal 4.7, there are many enhancements in the version for Drupal 6 (and 7).
Hopefully I can get NicaLiving running on Drupal 6 to offer a bit of breathing room for the conversion and make the conversion process easier. I will keep you posted.
#87620

Well-settled

Here is the update. No luck with the move to D6 but I have the D5 version working "OK" for now. That is, it is sick internally but the user doesn't see it.
Chris suggested CSV import for user info and RSS for something else. As neither is the way I think, I want to discuss this a bit more before looking at the specifics of each type of data. Let me explain how I would, generically, approach this.
I would want to do all the data repair before it got to the target. If possible, I would want the repair system to be independent of both the source and destination. That is, a unit that can be run/debugged on its own. Without looking at the details, I would most likely write a program (probably in Python) which would read the existing database tables and massage the data into a format that could then be imported by an importer which is part of the destination system.
Let me take the basics of the user account data as an example. There is no single source table that will produce what is needed. In fact, I know there are at least three and probably more. (Note that this is not unique to user account data in Drupal. Sometimes multiple tables make sense, other times it is just a kludge.) The program would then:
  • Read user file entries extracting what makes sense directly (user ID, name, email, for example)
  • Massage some other fields (timezone, for example) into the format the destination system would want
  • Do lookups in other tables and massage other information (for example, permissions)
This program could be run/debugged without having to interrupt the operation of the current site. Test imports could then be done once the format seemed correct to make sure things worked.
I would then go on to develop the programs for the other types of data -- photos, forum posts, comments, blog entries, ... in a similar manner.
When all the pieces worked, the current site could be turned off (or put in read-only mode), all the export/convert scripts run and then all the import scripts run. Then the users would be pointed at the new site.
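
As an illustration of the user-account step described above, here is a minimal sketch of a standalone converter. Everything specific in it is an assumption: the input keys (`uid`, `name`, `mail`, `timezone` as an offset in seconds), the shape of the role lookup, and the output column names are placeholders to be replaced once the real source tables and the destination's expected format are known.

```python
import csv
import io

def offset_seconds_to_utc_label(seconds):
    """Turn an offset in seconds (e.g. -21600) into a label such as 'UTC-06:00'."""
    seconds = int(seconds or 0)
    sign = "-" if seconds < 0 else "+"
    hours, remainder = divmod(abs(seconds), 3600)
    return f"UTC{sign}{hours:02d}:{remainder // 60:02d}"

def convert_user(user, roles_by_uid):
    """Copy the direct fields, massage the timezone, and look up roles."""
    return {
        "id": user["uid"],
        "username": user["name"].strip(),
        "email": user["mail"].strip(),
        "timezone": offset_seconds_to_utc_label(user.get("timezone")),
        "groups": "|".join(sorted(roles_by_uid.get(user["uid"], []))),
    }

def users_to_csv(users, roles_by_uid):
    """Write converted users to CSV text, independent of source and destination."""
    out = io.StringIO()
    fields = ["id", "username", "email", "timezone", "groups"]
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for user in users:
        writer.writerow(convert_user(user, roles_by_uid))
    return out.getvalue()

# Hypothetical sample rows standing in for data read from the old tables.
sample_users = [{"uid": 7, "name": " felipe ", "mail": "f@example.com", "timezone": "-21600"}]
sample_roles = {7: ["editor", "admin"]}
users_csv = users_to_csv(sample_users, sample_roles)
print(users_csv)
```

Because the converter only takes plain rows in and produces text out, it can be debugged against a database dump while the live site keeps running.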

To me, this is a relatively small conversion (about 7000 users, 100,000 data records of assorted flavors) so execution time of the jobs should be pretty short. But, there are lots of cleanup items I would want to do with the user information and data at this time. I cannot see any other way to do this that would allow the cleanup to progress automatically. (Also, this process is nothing new. I have moved millions of records from one database system to another before and, well, this is the way I have always done it.)

Comments?
#87621

Yeah, that's what I mean too :). I don't mean you literally make a CSV by hand in Libre Office or whatever – I mean you make/get some tool that generates one from Drupal automatically. Then import into ocPortal is simple as we already support it.

This CSV is your independent source, but it's not a manual process.


#87639

Well-settled

I haven't tried to import anything, so maybe trying will answer the question, but what I did not see was any indication of what fields you need for each type of data.

#87640

Best way to find out is to do a members CSV export from a clean ocPortal site and use that as a reference.

Any unrecognised columns import into new CPFs (Custom profile fields).
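
A small check along those lines, sketched under assumptions: it compares the header row of a generated CSV against the header row of a reference export and reports which columns match, which would presumably end up as new CPFs, and which reference columns are absent. The column names in the example are invented, and exact, case-sensitive matching is an assumption about how the importer recognises columns.

```python
import csv
import io

def split_columns(reference_csv, candidate_csv):
    """Compare CSV header rows; return (recognised, unrecognised, missing)."""
    ref_header = next(csv.reader(io.StringIO(reference_csv)))
    cand_header = next(csv.reader(io.StringIO(candidate_csv)))
    ref_set, cand_set = set(ref_header), set(cand_header)
    recognised = [c for c in cand_header if c in ref_set]
    unrecognised = [c for c in cand_header if c not in ref_set]
    missing = [c for c in ref_header if c not in cand_set]
    return recognised, unrecognised, missing

# Hypothetical headers; a real reference would come from a clean site's export.
reference = "Username,E-mail address,Timezone\n"
candidate = "Username,E-mail address,Favourite beach\nfelipe,f@example.com,Pochomil\n"
recognised, unrecognised, missing = split_columns(reference, candidate)
print(recognised, unrecognised, missing)
```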


#87648

Well-settled

Perfect. That is a great example of a "how to import" tip.

Note that Drupal (or at least Drupal 5) does not have a CSV export module. Because of the table structures, I don't see that this would be a particular advantage but I did want to mention it.
#87655

Well-settled

Chris Graham said

Best way to find out is to do a members CSV export from a clean ocPortal site and use that as a reference.

Not trying to pick on you Chris, but this is another one of those good examples where it's easy but it wasn't easy for this geek to find it. Why? Because what I needed to find was "download members spreadsheet". I used the admin search and looked for export. Found XML export just fine.

#87657

I'll check what the admin search finds for that, but you'll find this under Admin Zone > Tools > Members.


#87660

Well-settled

Oh, I found it. I am just trying to point out what is apparently a different "thinking path" you take and I take. You see it as "something to do with members"->export them and I saw it as "export something"->members.

I mainly bring this up because this is happening to me all the time. Maybe my thinking is "broken" but this tends to be my major source of frustration getting familiar with ocP.

So, bottom line, I think the book needs to help "us" look in the right place for things.
#87662

Sure. Please keep mentioning -- because when people do I really do try and tweak the synonyms in the Admin Zone search to ensure the right thing comes up.


#87670

That was a great test case. I now have a search for export members returning "Download member spreadsheet (CSV)" as the top result.

This was done by making 'export' and 'download' synonyms, and by making the search default to an 'AND' search, falling back to an 'OR' search only if it couldn't find a match.
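
The behaviour described can be illustrated with a short sketch. This is not the ocPortal implementation: the synonym table, the crude plural stripping, and the sample titles are all assumptions made to show an AND search with synonyms that falls back to OR only when nothing matches every term.

```python
# Hypothetical synonym table; each term also matches its synonyms.
SYNONYMS = {"export": {"download"}, "download": {"export"}}

def normalise(word):
    """Lowercase, drop surrounding brackets, and crudely strip a plural 's'."""
    word = word.lower().strip("()")
    return word[:-1] if len(word) > 3 and word.endswith("s") else word

def matches(title, term):
    """True if the term, or one of its synonyms, appears as a word in the title."""
    words = {normalise(w) for w in title.split()}
    return bool(({term} | SYNONYMS.get(term, set())) & words)

def search(query, titles):
    """AND search first; fall back to OR only if AND finds nothing."""
    terms = [normalise(t) for t in query.split()]
    and_hits = [t for t in titles if all(matches(t, term) for term in terms)]
    if and_hits:
        return and_hits
    return [t for t in titles if any(matches(t, term) for term in terms)]

# Made-up admin page titles for the demonstration.
titles = ["XML data export", "Download member spreadsheet (CSV)", "Member directory"]
print(search("export members", titles))
print(search("export calendar", titles))
```

With the AND default, "export members" returns only the spreadsheet page, whereas a plain OR search would also have returned the XML export and the member directory.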

So everyone – please do what felipe does and give little bits of feedback like this, it does help.

