Web Hosting Forum | Lunarpages
Topic: Site organization for large cluttered website
nebulous773 (Newbie, Posts: 2)
« on: February 06, 2008, 08:09:43 AM »

I have a website that consists of about 3000-4000 HTML pages.
At this point I am the only one left working on this project to get this website launched.
The problem, which I hope has a solution, is this: I need to see all the files that exist on the server at this moment.
The ideal would be a sitemap, but since a great many of the pages are not yet linked, I don't see how I can generate such a map.

Does anybody have any wisdom or experience to share on how I can generate something (a list, a map, anything) that will show all the files that are on the server at this moment?

Summary: The files are all uploaded. They need revision, and they need to be interconnected through links. I need a way to "see" all the files and be able to check them off as I work on them and get them online and working.

Any thoughts? 
Thanks.
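
For the "check them off" part of the question, one possible approach is a plain-text checklist with one line per file. The following is only a minimal sketch, assuming PHP 5.3 or later (for the SPL iterators) and a Linux host; the function name build_checklist and the output file name are hypothetical and do not come from this thread.

```php
<?php
// Hypothetical helper (not from the thread): walk $root recursively and
// write one "[ ] /relative/path" line per file to $listPath, so each file
// can be ticked off by hand in a text editor.
function build_checklist($root, $listPath)
{
    $root = rtrim($root, '/');

    // Leaves-only iteration yields files; SKIP_DOTS drops "." and "..".
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );

    $paths = array();
    foreach ($iterator as $file) {
        if ($file->isFile()) {
            // Keep the path relative to $root so it mirrors the site URL.
            $paths[] = substr($file->getPathname(), strlen($root));
        }
    }
    sort($paths); // predictable, readable ordering

    $out = fopen($listPath, 'w');
    if ($out === false) {
        return false;
    }
    foreach ($paths as $path) {
        fwrite($out, "[ ] " . $path . "\n");
    }
    fclose($out);

    return count($paths);
}
```

Calling build_checklist($_SERVER['DOCUMENT_ROOT'], 'checklist.txt') from a script in the web root would be one way to use it. The same caution applies as for any listing of a site: the output file should not stay on a public server.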

wektech (Jabba the Hutt, Posts: 686)
« Reply #1 on: February 06, 2008, 08:14:19 AM »

Depending on how these files are spread between folders, any good FTP tool should give you a pretty good picture.

HazardTW (Intergalactic Cowboy, Posts: 54)
« Reply #2 on: February 06, 2008, 09:57:48 AM »

I threw this together for a quick and dirty list of folders and files. Just copy it as a .php file to your web root folder. (It could go in a deeper folder, but put it at the root: if it is in an add-on domain folder, it will not be allowed to go up to the main web root.)

It recursively reads every folder and stores each folder's files and sub folders in an array for that folder.
When it is done, it iterates through the array of folders and lists everything that is in each folder.
It shows the folder name in bold red, then under that it lists the files in black and the sub folders of that folder in bold green.

Like I said, I just threw it together. It looks like it worked fine on mine, but I did not do a thorough check for missed folders or files, since the structure I ran it on came back with 263 folders containing 5548 files. Looking at the root, though, it did pick up all the folders there, as well as the couple of sub folders that I checked.

Code:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<title>Untitled Document</title>
<style type="text/css">
<!--

.folder{font-weight:bold;color:#093;}

-->
</style>
</head>

<body>
<?php
// Recursively read $folder and record its files and sub folders in
// $all_files, keyed by the folder's full path.
function folder_list($folder){
    global $totalFolders, $totalFiles, $all_files, $rootESC;
    if(!preg_match('/\/$/',$folder)) $folder .= "/";
    $currentFolder = opendir($folder);
    if($currentFolder === false) return; // skip folders that cannot be opened
    $all_files[$folder] = array();
    while (false !== ($file = readdir($currentFolder))){
        if ($file != "." && $file != ".."){
            $file = $folder.$file;
            if(is_dir($file)){
                $tmp = "FOLDER: ".basename($file);
                array_push($all_files[$folder],$tmp);
                $totalFolders++;
                folder_list($file);
            }else{
                array_push($all_files[$folder],basename($file));
                $totalFiles++;
            }
        }
    }
    closedir($currentFolder);
}

$all_files = array();

$totalFolders = 0;
$totalFiles = 0;

// set to FALSE if you do not want web_directory.txt written
$outputFile = TRUE;
$handle = false;

$root = $_SERVER['DOCUMENT_ROOT'];
// escape the root path so it can be stripped from folder names with a regex
$rootESC = preg_quote($root,'/');
echo "document root: $root<br>";
if(is_dir($root)) folder_list($root);
echo "<h3 style='color:blue'>Total Folders: $totalFolders</h3>";
echo "<h3 style='color:blue'>Total Files: $totalFiles</h3>";
if($outputFile) $handle = fopen('web_directory.txt','w');
if($outputFile && $handle){
    fwrite($handle,"\n\n========================================================================");
    fwrite($handle,"\n\nTotal Folders: $totalFolders\nTotal Files: $totalFiles");
    fwrite($handle,"\n\n========================================================================");
}
foreach($all_files as $folder=>$files){
    // show folder paths relative to the document root
    $folder = preg_replace("/^$rootESC/i",'',$folder);
    echo "<h3 style='color:#f00;'>FOLDER: $folder</h3>";
    if($outputFile && $handle){
        fwrite($handle,"\n\n------------------------------------------------------------------------");
        fwrite($handle,"\nFOLDER: $folder\n\n");
    }
    foreach($files as $f){
        echo "<span class='";
        if(preg_match('/^FOLDER:/',$f)){
            echo 'folder';
        }
        echo "'>---&gt; ".htmlspecialchars($f)."</span><br>";
        if($outputFile && $handle){
            fwrite($handle,"$f\n");
        }
    }
}
if($handle) fclose($handle);
?>

</body>
</html>

EDIT: I modified the code. By default it now also creates a text file of your web directory named "web_directory.txt", which resides in the same folder as this script.

If you don't want it to create the text file, you can change $outputFile = TRUE; to $outputFile = FALSE;


NOTE!! I should mention that you probably don't want to leave this file sitting on your server as it is. If it got linked to, somebody would have a COMPLETE map of your website structure and contents!
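
Deleting the script after use is the safest option. If it has to stay on the server for a while, one hedged possibility is to refuse to render the listing unless the request carries a secret token. This is only a sketch: the function name listing_allowed, the "token" query parameter, and the placeholder value are assumptions, not part of the script above, and hash_equals() requires PHP 5.6 or later.

```php
<?php
// Hypothetical guard (not part of the original script): allow the listing
// only when the query string carries the expected secret token.
function listing_allowed(array $query, $expectedToken)
{
    if ($expectedToken === '' || !isset($query['token']) || !is_string($query['token'])) {
        return false;
    }
    // hash_equals() compares the two strings in constant time (PHP 5.6+).
    return hash_equals($expectedToken, $query['token']);
}

// Intended use at the very top of the listing script (left commented out
// here; 'change-me' is a placeholder, not a recommended value):
//
// if (!listing_allowed($_GET, 'change-me')) {
//     header('HTTP/1.1 403 Forbidden');
//     exit('Forbidden');
// }
```

The listing would then be reachable only with ?token=... appended to the script's URL. A token in a URL can still leak through logs or browser history, so this reduces the risk rather than removing it.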

« Last Edit: February 06, 2008, 11:46:11 AM by HazardTW »

nebulous773 (Newbie, Posts: 2)
« Reply #3 on: February 06, 2008, 12:47:13 PM »

Excellent work with the PHP script. You must be a PHP wizard to just throw that together. It's a great little program and YES, it does show all the folders in the directory and YES (!) it also shows the files residing within. This is really going to help me visualize what I need to do to put this project back together.
Thank you, my friend. Your talents are very much appreciated.

HazardTW (Intergalactic Cowboy, Posts: 54)
« Reply #4 on: February 06, 2008, 05:12:16 PM »

You are welcome. I feel your pain when it comes to trying to visualize a large file structure.

I didn't have time earlier to play with it more; I had to go do some actual work (sorry for my use of four-letter words in here).

I started working on it again when I got home. It took a little time to get my head around the massive array and the recursive functions that build it and then display it.

I wanted to make it a little better. You should find this version much handier to view in a web page, and it can still create a text file.

It first displays only folder names. Click on a folder name to expand it and show the files in that folder; click again to hide the files.

It also indents one step for each level down a folder is, so overall it should give you a much better mental picture of your site structure. The number of files in each folder is displayed as well.

I added a setting you can change to true to have alternating background colors.

By default this one does not output a text file; change the setting to true to make it output to a text file.

Here is a cropped-down screenshot of what it looks like in the browser with one folder open, showing the 6 files in it:

[screenshot not preserved]

Code:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<title>Untitled Document</title>
<style type="text/css">
<!--
.blue{font-weight:bold;color:blue;}
.red{color:red;}
.toggle{
font-weight:normal;
font-size:small;
color:#093;
cursor:pointer;
}
.folder{cursor:pointer;}
a:link{color:black;}
a:visited{color:#999;}
-->
</style>
<script type="text/javascript">
<!--
function toggle(obj){
var child = obj.firstChild;
var sib = child.nextSibling;
while(sib.tagName != 'DIV' && sib.nextSibling){
sib = sib.nextSibling;
}
if(sib.style.display == 'block'){
sib.style.display = 'none';
}else{
sib.style.display = 'block';
}
return;
}
-->
</script>
</head>
<body>
<?php
// Recursively read $folder and return a nested array: 'FILES' holds the
// file names, 'FOLDERS' maps each sub folder name to its own nested array.
function folder_list($folder){
    global $totalFolders, $totalFiles;
    if(!preg_match('/\/$/',$folder)) $folder .= "/";
    $all_files['FOLDERS'] = array();
    $all_files['FILES'] = array();
    $temp_folders = array();
    $currentFolder = opendir($folder);
    if($currentFolder === false) return $all_files; // unreadable folder: report it as empty
    while (false !== ($file = readdir($currentFolder))){
        if ($file != "." && $file != ".."){
            if(is_dir($folder.$file)){
                $totalFolders++;
                array_push($temp_folders,$file);
            }else{
                array_push($all_files['FILES'],$file);
                $totalFiles++;
            }
        }
    }
    closedir($currentFolder);
    if(count($temp_folders)){
        foreach($temp_folders as $new_folder){
            $all_files['FOLDERS'][$new_folder] = folder_list($folder.$new_folder);
        }
    }
    return $all_files;
}

// Print one folder (name, file count, hidden file list), then recurse
// into its sub folders one indent level deeper.
function output_directory_structure($directory,$foldername,$tab_level){
    global $tab, $handle, $filetab, $bgcolor, $bg, $alternateBackgroundColor, $makeLinks;

    // alternate between the first two colors, or use the empty third entry
    $bg = $alternateBackgroundColor ? 1 - $bg : 2;

    echo "<div class='foldername' style='background-color:".$bgcolor[$bg]."'>\r";
    for($i = 0; $i < $tab_level; $i++){
        echo $tab;
        if($handle) fwrite($handle,$filetab);
    }
    echo '<span onclick="toggle(this.parentNode)" class="blue folder">'.$foldername.'</span>';
    echo count($directory['FILES']) ? " <span onclick='toggle(this.parentNode)' class='toggle'>[ toggle files <span class='red'>(".count($directory['FILES']).")</span>]</span>" : "";
    echo "<br>\r";
    echo '<div style="display:none;">'."\r";
    if($handle){
        fwrite($handle,html_entity_decode($foldername)."\n");
    }
    if(count($directory['FILES'])){
        foreach($directory['FILES'] as $file){
            for($i = 0; $i < $tab_level+1; $i++){
                echo $tab;
                if($handle) fwrite($handle,$filetab);
            }
            // links assume this script sits in the web root
            $link = preg_replace('/&lt;root&gt;/','..',$foldername).'/'.$file;
            if($makeLinks) echo "<a href='".$link."'>";
            echo $file;
            if($makeLinks) echo "</a>";
            echo "<br>\r";
            if($handle) fwrite($handle,$file."\n");
        }
    }
    echo "</div></div>\r";
    if($handle) fwrite($handle,"\n");
    if(count($directory['FOLDERS'])){
        foreach($directory['FOLDERS'] as $key=>$folders){
            output_directory_structure($folders,$foldername.'/'.$key,$tab_level+1);
        }
    }
    return;
}
//===============================================================================================//

// set to true if you want alternating background colors
$alternateBackgroundColor = true;
//  IF YOU WANT TO OUTPUT A TEXT FILE OF THE STRUCTURE, CHANGE $outputFile to equal true //
$outputFile = false;
// TRUE WILL MAKE HYPERLINKS OUT OF ALL FILES LISTED
$makeLinks = true;


$dir_structure = array();
$bgcolor = array('#eef','#eff','');
$bg = 0;
$totalFolders = 0;
$totalFiles = 0;
$root = $_SERVER['DOCUMENT_ROOT'];
$stime = microtime(true);
if(is_dir($root)) $dir_structure = folder_list($root);
$mtime = microtime(true);
echo "<h3 style='color:blue'>Total Folders: $totalFolders</h3>";
echo "<h3 style='color:blue'>Total Files: $totalFiles</h3>";
$tab = "<span class='blue'>.</span>--------";
$filetab = '.--------';
$handle = false;
if($outputFile) $handle = fopen('web_directory.txt','w');
if($handle){
    fwrite($handle,"\n\n========================================================================");
    fwrite($handle,"\n\nTotal Folders: $totalFolders\nTotal Files: $totalFiles");
    fwrite($handle,"\n\n========================================================================\n\n");
}
output_directory_structure($dir_structure,'&lt;root&gt;',0);
$etime = microtime(true);
echo "Time to build data structure: ";
echo $mtime-$stime;
echo "<br>Time to create web page from data: ";
echo $etime-$mtime;
echo "<br>Total time: ";
echo $etime-$stime;

if($handle) fclose($handle);
?>

</body>
</html>


** UPDATE:  added setting to make hyperlinks out of every file listed, default is TRUE.
« Last Edit: February 07, 2008, 10:03:28 PM by HazardTW »

MrPhil (Über Jedi, Posts: 2831)
« Reply #5 on: February 07, 2008, 05:38:20 PM »

Besides HazardTW's PHP script and cut-and-paste from any FTP client display, if your account is on a Linux server (all plans except Windows) you can request a list of all directories and files. Go to cPanel > Cron jobs, select a time a few minutes from now (server time, not your local or GMT), and enter ls -alR for the command. After you receive the email with the file listing, remember to delete the cron job so it doesn't get run again tomorrow. If you're on a Windows plan, if there is some kind of command line access or command/task scheduler, I think the magic is dir /s *.* to get a similar listing.

Since you're asking about organizing a site, permit me to put my two cents in. Keep everything you can out of the root directory for your primary domain, subdomain, or add-on domain. Only server stuff should be there, such as .htaccess, robots.txt, favicon.ico, error documents (e.g., 404.shtml), and maybe a few other things. Your site pages should be in a directory under public_html/ (primary domain) or public_html/subdir/ (subdomain or add-on domain). You will of course need to do something to "jump the gap" between the root directory and the one below it -- your index.* file or a redirection in .htaccess (manually edited or via cPanel, but avoid cPanel if you have subdomains or add-on domains) or a dummy index.html that just does a <meta http-equiv="refresh" content="0; url=/maindir/index.html" />. Put all your major subsystems (blogs, forums, stores, galleries, etc.) in their own directories under the (sub)domain root or the (sub)domain's main directory so that they stay out of each other's hair. This is especially important if you're installing/updating/removing canned systems (e.g., via Fantastico) -- you want to keep them nicely isolated. For your home-grown code, try to split it up into a directory tree that logically follows the function of your site. Try to keep a directory under maybe 30 to 40 files, just to keep you from going insane! Anyway, functionally distinct subsystems should not share the same directory, but should be in sibling directories, just to keep everything clean. Don't go overboard and split up a collection of files just because it had 41 members, or end up with a bunch of directories with 1 or 2 files (unless you plan to grow them in the future). Well, that was probably more than two cents, but hey, inflation is back!
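
As an alternative to the meta refresh, the "jump the gap" step could also be done server-side from a small index.php in the root directory. This is a minimal sketch under assumptions: the function name redirect_to_main is hypothetical, "/maindir/index.html" is the example path from the paragraph above, and whether a permanent (301) redirect is appropriate depends on the site.

```php
<?php
// Hypothetical root index.php helper: send visitors from the domain root
// to the site's main directory with an HTTP redirect.
// $send defaults to PHP's built-in header(); it is a parameter only so the
// function can be exercised without emitting real headers.
function redirect_to_main($target = '/maindir/index.html', $send = 'header')
{
    // header($string, $replace, $response_code): a 301 marks the move as permanent.
    call_user_func($send, 'Location: ' . $target, true, 301);
}

// Intended use as the entire body of public_html/index.php
// (left commented out so this block has no side effects):
//
// redirect_to_main();
// exit;
```

Unlike a meta refresh, this sends the redirect before any page content, so the visitor never sees an intermediate page.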

Of course, if you had a poor structure with 3 or 4 thousand files in it already, it will not be painless to restructure it! Every link will need to be redone, but it may well be worth the investment. I hope you don't have 4000 files in public_html/!

caliber (Pong! (the videogame) Master, Posts: 24)
« Reply #6 on: February 24, 2008, 10:46:16 AM »

Quote (MrPhil): Since you're asking about organizing a site, permit me to put my two cents in. Keep everything you can out of the root directory for your primary domain, subdomain, or add-on domain. Only server stuff should be there, such as .htaccess, robots.txt, favicon.ico, error documents (e.g., 404.shtml), and maybe a few other things. Your site pages should be in a directory under public_html/ (primary domain) or public_html/subdir/ (subdomain or add-on domain). You will of course need to do something to "jump the gap" between the root directory and the one below it -- your index.* file or a redirection in .htaccess (manually edited or via cPanel, but avoid cPanel if you have subdomains or add-on domains) or a dummy index.html that just does a <meta http-equiv="refresh" content="0; url=/maindir/index.html" />.

Thanks MrPhil, this is a great post. I don't have a large site, but I try to keep things logically organized and tried to put some thought into my web structure when I first set it up. However, I am learning as I go!

On my site I do have content in the root of public_html. I had started out with content in subdirs but went to using the root to keep my URLs simple. Now I am about to have a new add-on domain, and I don't care for the thought of having this different domain stuck in the middle of the main site. I was wondering if the content from the main site could be moved to a subdir, but I also worry about how the addressing would work. You mention using redirects. I understand how to do that, but won't it still show as "www.mainsite.com/subdir/page.htm" for the URL?


Latisha

MrPhil (Über Jedi, Posts: 2831)
« Reply #7 on: February 25, 2008, 02:59:43 PM »

Quote (caliber): You mention using redirects. I understand how to do that, but won't it still show as "www.mainsite.com/subdir/page.htm" for the url?

If you drop your main site into public_html/main/, then normally I think your visitors would see "main" in the URL. I suppose it may be possible with a lot of redirection effort to slip "main/" into the actual path without the visitor seeing it (don't use R=301 on the URL rewrite?), but is it worth it? No non-trivial site should put all its pages into the root level directory, so sooner or later your visitors are going to see some subdirectories in the URL. Is that a problem? And I suspect that any site editor (FrontPage, etc.) is going to have fits trying to keep track of making links to other pages which have to be fudged to give the "short" target address. I don't think it's worth the effort to try to hide your "main" top directory -- I'd recommend just living with it. If you're using PHP for dynamic pages, it might be feasible to have everything as a URL Query String to index.php, but the coding is pretty complicated.
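
To give a rough idea of the "everything as a URL Query String to index.php" approach, here is a minimal sketch, not a full implementation. The function name resolve_page, the "page" parameter, the ".htm" extension, and the "main" directory are all assumptions for illustration.

```php
<?php
// Hypothetical front controller helper: map index.php?page=about to
// <contentDir>/about.htm, so the visitor never sees the directory name.
// Returns the file path, or null when the page name is invalid or missing.
function resolve_page($page, $contentDir)
{
    // Accept only simple names so a request cannot escape $contentDir
    // (for example with "../").
    if (!is_string($page) || !preg_match('/^[a-z0-9_-]+$/i', $page)) {
        return null;
    }
    $path = rtrim($contentDir, '/') . '/' . $page . '.htm';
    return is_file($path) ? $path : null;
}

// Intended use in public_html/index.php (left commented out; "main" and
// "home" are placeholders):
//
// $page = isset($_GET['page']) ? $_GET['page'] : 'home';
// $file = resolve_page($page, __DIR__ . '/main');
// if ($file === null) {
//     header('HTTP/1.1 404 Not Found');
//     exit('Not found');
// }
// readfile($file);
```

This only covers flat page names. Nested directories, relative links inside the served pages, and images would all need extra handling, which is part of why the coding gets complicated.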