So, I’ve finally had some time to get back to the linear programming
followup. You might want to go back and look at the earlier post to remember what I was talking about.
The basic idea is that we’ve got a system we’d like to optimize. The constraints of the system are defined by a set of linear inequalities, and
we’ve got a linear expression that we’d like to maximize.
So, for example, we could have a factory which is capable of producing two
different kinds of widgets. The factory has a maximum capacity of what it can
produce of 300 widgets per day. It gets a supply of metal and plastic for
making widgets: 5000 kilograms of steel, and 100 kilograms of plastic. There
are two kinds of widgets. Type 1 widgets take 23 kg of steel, and 2kg of
plastic, and sell for $900 each. Type 2 widgets take 5kg of steel, and 2
kilograms of plastic, and sell for $700 each. The factories customers need a
minimum of 5 type 1 and 2 type 2 widgets per day. So, the question is, what’s the optimal way to run the factory? How many of each widget should it make?
Using the formulation we discussed last time, we can formulate this
as a linear programming problem as follows:
- Let x1 be the number of type one widgets we produce.
- Let x2 be the number of type two widgets we produce.
- Objective: maximize 900x_{1} + 700x_{2}, where:
- x_{1} + x_{2} ≤ 300
- 23x_{1} + 5x_{2} ≤ 500
- 2x_{1} + 2x_{2} ≤ 50
- x_{1} ≥ 5, x_{2} ≥ 2
Now, how do we get the answer? As I discussed last time,
there’s a conceptually simple algorithm called simplex. The
idea of the simplex algorithm is to do a basic form of something called hill-climbing. The idea of hill-climbing is really simple. You’re looking for the highest possible point in something which is modeled as a surface. So you try to go uphill: each step, you look for the next step which will get you the furthest up.
In simplex, you know that the solution is a vertex of an N-dimensional polygon. So you find an initial vertex – it doesn’t matter which one. Then you look at the edges that are incident on that vertex – and you pick the one that looks like it’ll get you the farthest uphill. You keep doing that until you get to one where no edge will take you to a higher vertex – and you’re at the maximum.
In general, simplex is very fast. Typically, even if you have the bad luck to start at a minimum vertex, you’ll get to the maximum very quickly. But the worst case is pretty bad: you can create LP problems where you’ll wind up visiting every vertex. You can see why by imagining a cube. You can distort the cube so that if you start at the minimum vertex, and always climb the edge with the steepest slope, you’ll have to visit every single vertex. And there are O(2^{n}) vertices in an N-dimensional polygon. As you increase dimensions, you can do the same distortion trick, to force the simplex algorithm to visit all of the vertices. (Or at least O(2^{N}) vertices.) But those cases are very rare, and frankly very contrived. So
in general, you don’t worry about them.
There are other algorithms to solve linear programming problems. There’s a very fast one – fast even in the pathological worst cases – called the interior points method. But for the most part, simplex is the way people go.
If you really need to solve a linear programming problem, you
don’t generally write simplex yourself. Implementing simplex
correctly is a pain. It’s typical of complex numerical analysis problems –
it’s a miserable pain in the ass to implement. But, it’s also such a common
problem that there are a huge number of highly optimized solvers already
written; you just grab one off the shelf. (There’s a lesson in there;
never waste time doing work yourself if you know someone else has already done it.) So, for this post, I grabbed a copy
of maxima, an open-source symbolic algebra system, which
includes a very nice solver.
In the last post, I showed how you can formulate the problem as a matrix, which is the input for the simplex algorithm. But maxima, since it’s so good at symbolic work, doesn’t even require you to do that (although most simplex solvers do take the problem in matrix form). Here’s how I wrote the problem as a maxima script:
load("simplex")$ maximize_lp( 9*x1 + 7*x2, [x1 + x2 <= 300, 23*x1 + 5*x2 <= 500, 2*x1 + 2*x2 <= 50, x1 >= 5, x2 >= 2]), nonegative_lp=true;
When I feed that in to maxima, it converts it to the matrix form,
and then solves it. The answer is that the maximum profit is
about $21,667, which you can make by producing 4 1/6 type 1 widgets,
and 20 5/6 type 2 widgets.
Now, that’s very nice isn’t it? We can maximize the factory’s profit
by producing 4 1/6 type one widgets. How how we produce 1/6th of a widget?
The way we formulated the problem is great for working out probabilities
for optimal grand strategies for zero-sum games – probabilities can
be arbitrary real numbers. But for the problem that we’re talking about – optimizing profit by producing discrete entities, you need an additional
constraint: the values in the solution must be integers.
For integer solutions, the simplex method goes right out the window. It’s actually an entirely different problem: integer linear programming. The strategies and solutions for solving ILP programs are quite different. Integer linear programming is a much harder problem. Instead of the polynomial time
algorithms that work for general LP, ILP is NP-hard – that is, its decision
problem (asking “Is this the optimal solution?”) is NP-complete; actually computing the answer is even worse.
Anyway, next post, we’ll get back to game theory – how we can set
up a zero-sum game as a linear programming matrix in order to find the
optimal grand strategy.
Also, to add insult to injury – the difference between the optimal LP and ILP solutions can be arbitrarily large.
Suppos there’s a typo. You probably meant – and on work on with – 500 kg of steel. Anyway, nice post, though more of an appetizer. Keep on!
Just to throw it out there (and because I did some work on this my first year of grad school), some methods from ring theory can be used to solve ILP problems. It turns out that solving such a problem is closely related to the problem of finding the right set of elements (a Gröbner basis) that generate a certain ideal.
I’m concerned, though, that one of your requirements is that x1 be greater than 5, but in your solution, x1 equals only 4_1/6. In real life, we’d make 5 Type1’s and operate at a suboptimal level for revenue — not profit 🙂 — but I’m curious as to what happened in the math. Did maxima ignore a condition?
Also, in your example, we see that Type1’s need ~5 times as much input to yield only ~25% more revenue than Type 2’s. Let’s say the factory gets 25kg of both steel and plastic with no minimum building requirements. It’s a no-brainer to make all Type2’s ($3500 in revenue vs. $900).
So we know from the outset that we want to minimize the number of Type1’s we build — in your example, 5 — and use the rest of the plastic to make Type2’s — 20.
Now, you can play with the given amounts to obscure the answer at the outset, but using your numbers, we see that we don’t need a matrix-crunching LP. Just some common sense.
Re #4
if ‘he couldn’t dig much deeper if he so desired’ explain the PhD? Do you suppose he had his dissertation ghost-written and hired a body double to do the defense?
a) I find them to have significant merit.
b) The ‘frequency of his blunders’ (not, I would claim, as frequent as you’re implying) suggest to me that this is a blog, rather than his job. What a coincidence.
I suspect that the post from “website design” is actually spam – given that
(a) It uses “website design” for its username.
(b) It uses a URL that links to a business site selling
website design services.
(c) It doesn’t actually address anything in the post it’s responding to.
Mr./Ms. Website Design: If you contact me by email at the contact address in the blog info to confirm that you’re a real person, I’ll leave your comment. If I don’t hear back from you by tonight, I’ll mark it as spam.
> But, it’s also such a common problem that there are a huge number
> of highly optimized solvers already written; you just grab one
> off the shelf. (There’s a lesson in there; never waste time
> doing work yourself if you know someone else has already done it.)
In other words…. learn the lesson *except* if you think you can do better than what’s out there by programming a more optimized version? 😉
That’s awfully generous of you, Mark, to give it* that much time. Not only does it not actually address anything in the post, that verbiage could be posted on almost *any* blog and appear (!) relevant.
What seems most suspicious is the phrase “this guy’s deluge of articles.” Personally, I like your posting frequency – plenty of time to digest and discuss posts – but there’s no way you could be said to generate a ‘deluge’ in any way. It’s not responding to you; it’s trolling to generate traffic to its site.
Kudos to you for taking the ‘innocent until proven guilty’ road, but for me, the jury is in.
(*I am deliberately using the impersonal pronoun to refer to the subhuman behind that comment.)
Or you find out how to make nanowidgets instead.
Carl:
Actually, I’d say that if someone has already done it, you should just reuse it, unless you’re sure that you can do better, and you’re also sure that the amount of time it will take you to do better is justified by the savings.
(I’ve had a coworker who spent weeks hand-optimizing a routine, which was actually only executed inside of an I/O bound loop. Yeah, they got the routine faster by a factor of three – but they had absolutely zero impact on the actual performance of the code – they just added a few idle cycles to the I/O wait. Weeks of work, for that!
I had another co-worker who believed that he was Mr. Super Hacker Jock, who could always write the most perfectly optimized code in C++. He implemented a sort of searchable cons list structure in C++ – spent months working on it, and then hand-crafting approximate comparison algorithms for determining similarities between trees.
I wrote a prolog program – not to try to upstage him, but to help me understand how things work. The hand-optimized code was so hard to read that I was having trouble following it for some debugging. So I decided to write a prolog version to use as an understanding tool. The prolog code was faster. And that’s a truly astonishing statement: the prolog interpreter I was using sucked; its performance was terrible. But it managed to beat the pants off of the hand-optimized code. The problem ended up relating to memory management issues; the hand-optimized code tried to do stuff with shared subtrees, only copying when necessary. But it ended up doing more copying than not using shared subtrees at all. )
Mark. Good post.
The diff between LP & ILP is very counter intuitive, as are many complex problems. Just had a dialog earlier today with a client for whom I had to pull a heap of research to demonstrate my non-intuitive point (something about over-automation costing more money over the long term)
anyone who thinks common sense is an alternative to thought is living in a very simple world.
I haven’t used Simplex for several decades, but I remember its advantage was when you had more variables than you had equations (or inequalities). But in your example, it was the other way around: 5 inequalities and 2 variables.
A simple application using simultaneous equations gives x1=13(.8r) and x2=36(.1r) with a result of $37,777.77, which is much better than maxima’s $21,667 – and was calculated using a scrap of paper and a pen in about 5 minutes.
Unless my math is wrong … and I’m hoping/dreading that someone will figure out where.
Before anyone points it out, I was using Mark’s written description of the problem (with Odysseus’ typo correction), which translates to :
23X1 + 5X2
That’ll teach me not to use preview!
The less than is messing me up, the 4 equations were supposed to be:
23X1 + 5X2 LE 500
2X1 + 2X2 LE 100
and:
23X1 + 5X2 LE 500
2X1 + 2X2 LE 50
(LE = “less than or equal to”)
I’ll repeat the comment I attempted to make yesterday, which was apparently swallowed by the dreaded SB spam filter.
Just to throw it out there, and connect to another topic Mark wrote about a while back, solving an ILP problem is actually equivalent to a problem in ring theory. Namely, given an ideal in a ring (which can be naturally associated to the ILP problem), find a particular sort of generating set for the ideal (specifically, a Gröbner basis).
This ring theory problem can be solved using the Buchberger algorithm, but I don’t think there are any implementations of Buchberger which can compete with the more common methods of attacking ILP problems (I tried, and failed, to write a speedy version of it myself back in my first year of grad school).
Can someone give a more detailed explanation of how the constraints 4,5,6 were constructed? Especially the Right Hand Side. Also, why don’t 5,6 read:
23×1 + 2×2 leq 500
5×1 + 2×2 leq 50
(leq – is less then or equal)
Alex,
The first constraint states that the amount of steel used to create type 1 widgets, plus the amount of steel used to create type 2 widgets, must be less than or equal to the total daily supply of steel. The second constraint is similar, but for plastic.
When you wade through the typos (apparently Marc *should* have said “It gets a supply of metal and plastic for making widgets: 500 kilograms of steel, and 50 kilograms of plastic”) you get:
23X1 + 5X2 LE 500
2X1 + 2X2 LE 50
The other constraints were to do with the maximum number of widgets that can be produced (X1 + X2 LE 300) and the minimum number of widgets that can be produced (X1 GE 5, X2 GE 2). The object then is to maximize the sales, expressed as 900X1 + 700X2.
My point was that if you treat the main two contraints as equalities (23X1 + 5X2 = 500, 2X1 + 2X2 = 50) you can use simultaneous equations to get the answer X1=20.83r, X2=4.16r which just happens to satisfy the other two contraints, AND which maximizes the sales value.
I just don’t think it’s a particularly good example of Linear Equations or the Simplex method.
Ian, thanks for the clarification.
I really should find a good reference with examples on how to formulate LP from word problems. Because I actually can solve LPs, but with my pure math background I was never taught how to formulate them, it was always said to be the job of people with business degrees.
As for your point of simultaneous equations, the only reason you can do that effectively is because the problem is so small. A practical problem will be impractical to state in a blog, that’s why Mark went for a toy example. A common practical size of problem would have more Xi’s (but not too many, much fewer than constraints at any rate) and hundreds of constraints (if not thousands, if you’re analyzing interplay of many processes and not just one factory). With simultaneous equations you’d have to deal with all of them at the same time. Simplex method (and its ilk) would look at maybe 10 of them at a time, if that, and might not even look at many of them at all if the optimality implicitly can’t break them (something you might not know as a human and sim-equations can’t know as an algorithm).
Also to summarize a bit the point Mark was making with run time analysis. While Simplex can be arbitrarily slow in theory, in any practical application it is incredibly and surprisingly (if you consider the sheer size of the real problems) fast. You can’t make Simplex and Co. slow without contriving a weird problem with no practical meaning.
Alex,
I agree with you mostly. But I believe that sim-equations work when there are as many variables as (in)equalities, and lin-equations work when there are more variables than inequalities. In 1968, I did Simplex by hand. This involved writing everything as matrices. Then you set some of the variables to zero and solve the resulting sim-equations. Repeat until you reach the optimum result.
As to using sim-equations to solve Marc’s problem, it seemed to be the easiest way of doing it if viewed as a problem to be solved, and not an example of Simplex. When the only tool you have with you is a hammer, every problem starts looking like a nail!
Alex,
Thinking about it, I was using Simplex to solve Marc’s problem. My manual Simplex method involves reducing the inequalities to sim-equations by setting some of the variables to zero. But as Marc’s problem has 2 equations and 2 variables, I didn’t need to do that – I already had the sim-equations I needed!
Ian, I don’t think Mark’s point was that you’d want to use Simplex to solve this problem, but that this is how you would solve this problem with Simplex. With more practical applications left as exercise to the reader. After all, he’s not really teaching a course here, just giving a taste.
I take a course on the Simplex method in college last year. Sadly though I have forgotten most of it. We used AMPL, pretty solid program.
Alex:
You’re being much to kind. The fact of the matter is I screwed up, and set up an overconstrained problem for the example, and worse, I was in such a hurry writing the article that I didn’t even notice that it was overconstrained and had only one solution. Basically, it’s a total screwup on my part. But I haven’t had time to go back and set up a proper example. Eventually I’ll find the time, and post a new article explaining how I screwed up, and showing a real example.
(In the meantime, I’ll just leave this here in its wretched glory to show everyone how stupid I can be. But at least I’ve got the integrity to admit to screwing up, and not just pull the article down and pretend that I didn’t screw up. That’s a sort of strange point of pride for me; a lot of the people that I regularly criticize, like the Disco bozos, screw up, and then go and rewrite or delete articles in order to pretend that they never make serious mistakes. If I screw up, I won’t hide it; I’ll take responsibility for being an idiot.)
Marc,
As a friend of mine says, it’s a good example of a bad example 🙂
Fractional units are no problem. Average over several days. Six days will do.
Of course, the problem assumes people will buy them these widgets, but as we know, no one wants SUVs anymore.
I LIKE IT PROGRAM.
Everybody knows implementing simplex correctly is painful more than being left by our beloved one.So my proposal is after finding constraints and objective function[as above]why dont we use graphical method instead of simplex and assist me to know which point will give you maximum value