Web Hosting Forum | Lunarpages


*
Welcome, Guest. Please login or register.
Did you miss your activation email?



Login with username, password and session length
February 09, 2012, 05:36:37 PM

Pages: [1]   Go Down
  Print  
Author Topic: MySQL query problem call for help  (Read 570 times)
dknife
Space Explorer
***
Offline Offline

Posts: 7


« on: March 11, 2010, 09:17:00 AM »

Wondering if there is anyone decent with MySQL that could give me some help. I'm trying to pull a count and sum result but no matter how I structure the query, it is doing a sum of values I don't want.

table 1: relevant entries from selling_history

Code:
+-----+-----------+----------+-----------+------------+--------+--------+------------+
| id  | FirstName | LastName | Office    | agentMLSID | MLSID  | price  | date       |
+-----+-----------+----------+-----------+------------+--------+--------+------------+
| 108 | Sharon    | Name1    | Somewhere |       8401 | 254941 | 185000 | 1266339600 |
| 125 | R. Ashley | Name2    | Somewhere |       4153 | 258477 | 160000 | 1266944400 |
| 126 | Sharon    | Name1    | Somewhere |       8401 | 258477 | 160000 | 1266944400 |
| 127 | Ronald    | Name1    | Somewhere |       8671 | 258477 | 160000 | 1266944400 |
| 140 | R. Ashley | Name2    | Somewhere |       4153 | 255686 | 121000 | 1267203600 |
| 144 | Sharon    | Name1    | Somewhere |       8401 | 257161 |  65900 | 1267203600 |
+-----+-----------+----------+-----------+------------+--------+--------+------------+

table 2: relevant fields from users

Code:
+-----------+----------+--------+
| firstname | surname  | teamid |
+-----------+----------+--------+
| Ronald    | Name1    |      5 |
| Sharon    | Name1    |      5 |
| R. Ashley | Name2    |      5 |
+-----------+----------+--------+

table 3: relevant row from teams
Code:
+--------+----------+------------------------+
| teamid | leaderid | teamname               |
+--------+----------+------------------------+
|      5 |      117 | Big Seller Group       |
+--------+----------+------------------------+

First example query:

Code:
select teamname, selling_history.mlsid, selling_history.price from selling_history
  left join (users,teams) on (selling_history.firstname = users.firstname
  and selling_history.lastname = users.surname and users.teamid = teams.teamid)
  where users.teamid > 0 and selling_history.Office LIKE 'Somewhere' and date > 1265000461 and date < 1267376399;

+------------------------+--------+--------+
| teamname               | mlsid  | price  |
+------------------------+--------+--------+
| Another team           | 254375 | 135000 |
| Another team           | 254375 | 135000 |
| Big Seller Group       | 258477 | 160000 |
| Big Seller Group       | 254941 | 185000 |
| Big Seller Group       | 258477 | 160000 |
| Big Seller Group       | 257161 |  65900 |
| Big Seller Group       | 258477 | 160000 |
| Big Seller Group       | 255686 | 121000 |
| Another team2          | 158421 |  87000 |
+------------------------+--------+--------+

Now my intention is to get the total volume of sales for a team, but not duplicated ones. The duplicates are indicated by the mlsid (in this case, specifically 258477). I cannot do distinct price because there may be other sales by the same team with the same price but different mlsid.

Next query:
Code:
select count(distinct selling_history.mlsid) as count, teamname, selling_history.price from selling_history
  left join (users,teams) on (selling_history.firstname = users.firstname
  and selling_history.lastname = users.surname and users.teamid = teams.teamid)
  where users.teamid > 0 and selling_history.Office LIKE 'Somewhere' and date > 1265000461 and date < 1267376399 group by teamname;
+-------+------------------------+--------+
| count | teamname               | total  |
+-------+------------------------+--------+
|     4 | Big Seller Group       | 160000 |
|     1 | Another team 2         |  87000 |
|     1 | Another team           | 135000 |
+-------+------------------------+--------+

That result for the count is correct, it should only be 4 unique sales. However when I also add in the sum:

Code:
select count(distinct selling_history.mlsid) as count, teamname, sum(selling_history.price) as total from selling_history
  left join (users,teams) on (selling_history.firstname = users.firstname
  and selling_history.lastname = users.surname and users.teamid = teams.teamid)
  where users.teamid > 0 and selling_history.Office LIKE 'Somewhere' and date > 1265000461 and date < 1267376399 group by teamname;
+-------+------------------------+--------+
| count | teamname               | total  |
+-------+------------------------+--------+
|     4 | Big Seller Group       | 851900 |
|     1 | Another team 2         |  87000 |
|     1 | Another team           | 270000 |
+-------+------------------------+--------+

It then adds all 6 previous rows together, instead of just the 4 rows. The total should be 531900. I cannot do a distinct on the price. I've been stuck on this query for the last day and no matter how I structure I cannot get a valid result.

Any suggestions?
« Last Edit: March 11, 2010, 01:30:54 PM by dknife » Logged
MrPhil
Berserker Poster
*****
Offline Offline

Posts: 5083



« Reply #1 on: March 11, 2010, 01:10:46 PM »

I'm trying to slog my way through your tables and queries, but it's very confusing. First, could you please go back and Modify your post to 1) add spaces to make columns line up, and 2) break up the SQL queries into multiple lines, so we don't have to scroll left and right all over the place? Thanks.

Next, your tables are very poorly structured. You have lots of redundant data, which wastes space and introduces the risk of having inconsistent data. First, I would have an agent table:
Code:
agentMLSID,  FirstName, LastName, OfficeID
I'm assuming that the OfficeID doesn't change all that often. If it does, you could keep the OfficeID with the selling_history or teams table.

I don't know if the teams change membership, or are fairly fixed. If they're fixed, they can have a teams table:
Code:
teamID, teamName, leaderID
and a teamMembers table listing who's in the team:
Code:
teamID, agentMLSID
If team membership is fairly fluid, you'd need to do something else.

Then your selling_history table becomes
Code:
id, agentMLSID, MLSID, price, date
I take it the MLSID is the ID of the sale (transaction) itself? What then is "id"? If a given team handles the sale, you then have only one row per sale, and your problem goes away (replace agentMLSID by teamID).

From your question, I'm guessing that multiple agents, on a team, can participate in a sale, but you only want to count the value of a sale once for a team, not once for each team member (agent)? In "next query", "total" is just a placeholder until the next step? It doesn't seem to mean anything in that context, although it may give a clue as to what's going wrong. As I said above, crediting a sale to a team would leave you with only one row per sale, which should clear up the problem (plus, eliminate redundant data for price and date). If team membership is very fluid, are you willing to create new "teams" for every combination of agents that makes a sale?
Logged

dknife
Space Explorer
***
Offline Offline

Posts: 7


« Reply #2 on: March 11, 2010, 01:29:33 PM »

I've fixed up the formatting, sorry about that! And thanks for the reply.

I'll explain a bit about the db structure. The selling_history table is completely separate from the rest of the database in regards to normalization and structure. This is because all the information and fields in that table are an export from another program. The only data I can cross reference between the selling_history data and the users themselves is the firstname, lastname and office. They will match. Ignore the agentMLSID, that I included in the export to hopefully reference later as an id match, instead of horrible name matching.

The MLS ID is an ID given to a property in real estate, which is what these sales are, realtors selling property. The ID itself in that table is the primary key. I store the members of a team in the teams table, which is referenced by the teamid in the users table. I don't have any team information from the export in the selling_history, and I need to combine the sales volume data into a team, instead of individually (in the case that they are actually in a team, but I can do UNION's to get the rest of the data). So I'm matching the first and last name to the users table.

So due to the fact that I'm exporting the data from another program and trying to work with what I have, that is why it is currently in the mess it is. So whilst your suggestions would be very good and if I was storing the transactions myself I would have done so in a far more normalized fashion, unfortunately I don't have that luxury.

Quote
In "next query", "total" is just a placeholder until the next step? It doesn't seem to mean anything in that context, although it may give a clue as to what's going wrong.

Yeah that was a mistake when posting here as I just took out the sum() from the last query but didn't remove the "as total".
« Last Edit: March 11, 2010, 01:32:00 PM by dknife » Logged
MrPhil
Berserker Poster
*****
Offline Offline

Posts: 5083



« Reply #3 on: March 11, 2010, 05:58:34 PM »

OK, my understanding is that FROM and WHERE create a virtual table with (in this case) 6 rows for "Big Sellers Group". SUM(selling_history.price) will add up all 6 rows of this virtual table. SUM(DISTINCT selling_history.price) would work, as you've noted, except that there might be two different sales at the same price.

There might be a way to do this in a single query, which I can't see right now, but how about an alternative? Instead of summing up the column, how about returning the individual rows (ORDER BY teamname, mlsid). Read each row in PHP and calculate the running total:
Code:
$results = array();
$current_team = '';

while ($row = mysql_fetch_array($query_result)) {
   if ($row['teamname'] != $current_team) {
      // start total for a new team, after saving away old result
      if ($current_team != '') {
         $results[] = array($count, $current_team, $total);
      }
      $total = 0;
      $count = 0;
      $current_team = $row['teamname'];
      $current_mlsID = 0;
   }

   // add to total and count only if not same sale as before
   if ($current_mlsID != $row['mlsID']) {
      $total += $row['price'];
      $count++;
      $current_mlsID = $row['mlsID'];
   }
}

Well, not as pretty as a single query delivering the goods prepackaged, but maybe easier to understand?
Logged

dknife
Space Explorer
***
Offline Offline

Posts: 7


« Reply #4 on: March 15, 2010, 07:54:49 AM »

Thanks for the suggestion, I was looking at the possibility of doing it in php but I wanted it all in a query sa I also want to union more queries for the rest of the non-team data and do the sorting on the sql side rather than having to deal with multi-dimensional arrays in php.

Suggestion from another forum:

SELECT COUNT(*) count
     , t.teamname
     , SUM(sh1.price) total
  FROM selling_history sh1
  LEFT
  JOIN selling_history sh2
    ON sh2.mlsid = sh1.mlsid
   AND sh2.id < sh1.id
  JOIN users u
    ON u.firstname = sh1.firstname
   AND u.surname = sh1.lastname
  JOIN teams t
    ON t.teamid = u.teamid
 WHERE sh2.id IS NULL
 GROUP
    BY t.teamid;

That works perfectly with the specific rows in question. I figured out why it is not working for me though. Due to it being a real estate transaction, and multiple agents involved, there was a 4th agent in this transaction which is not part of the team, just to throw a spanner in the works!

Code:
mysql> select * from selling_history where mlsid = 258477;
+-----+-----------+----------+-----------+------------+--------+--------+------------+
| id  | FirstName | LastName | Office    | agentMLSID | MLSID  | price  | date       |
+-----+-----------+----------+-----------+------------+--------+--------+------------+
| 124 | Terry     | Name3    | Lancaster |          0 | 258477 | 160000 | 1266944400 |
| 125 | R. Ashley | Name2    | Lancaster |       4153 | 258477 | 160000 | 1266944400 |
| 126 | Sharon    | Name1    | Lancaster |       8401 | 258477 | 160000 | 1266944400 |
| 127 | Ronald    | Name1    | Lancaster |       8671 | 258477 | 160000 | 1266944400 |
+-----+-----------+----------+-----------+------------+--------+--------+------------+
Logged
dknife
Space Explorer
***
Offline Offline

Posts: 7


« Reply #5 on: March 22, 2010, 01:29:46 PM »

Just wanted to say thankyou for your suggestion, MrPhil, I ended up using your code with a sorted multi-dimensional array and it worked perfectly.
Logged
Pages: [1]   Go Up
  Print  
 
Jump to: