[clue] Performance Computing was Re: seeking advice cleaning up root

Raymond DeRoo rderoo at deroo.net
Mon Mar 21 06:10:33 MDT 2011


Jon--

>> Good programmers do not. They take the time to learn enough about the
>> OS to write the most optimal code available.
> 
> "Most optimal code" is in the eye of the beholder.  Often the trade-off
> in saving machine cycles is overwhelmed by the human time lost in doing
> the optimization.

This again depends upon the environment. When working on the scale of tens of thousands of systems, the human time cost often does become cheaper than "throwing hardware at the problem". I work in an environment where requesting new hardware is easy: if I ask for 20 servers, I get them. But if I want them *turned on*, I need to justify the power consumption of each machine. So if a developer spends an extra week "saving machine cycles" and the difference is that I need to bring 150 fewer machines online, then that week is easy to justify. But this is also a mindset. We only hire developers who want to work in high performance computing, and not all do. Let's take the CLUE website as an example. Drupal is a good and robust CMS, but anyone who has used it for large-volume sites knows it doesn't scale really well. That doesn't make it bad software, just not the right software for every job. It's perfect for CLUE, and for other organizations like us whose membership is small (less than a few thousand) and whose traffic volumes are light to moderate.


> I will be the first person to state that a good programmer should know
> how a compiler generates code, and may even alter their code so a
> particular compiler (gcc for instance) will generate more optimal code.
> It is one of the reasons why I despair over computer science departments
> that do not teach machine and assembler language.  I do not expect
> people to spend much of their coding career writing assembler, but they
> should understand the trade offs of the computer architectures.

I wholeheartedly agree. In fact I spoke to professors and faculty from all over the US less than a month ago in Cincinnati about this very subject. The conference was on Technology in Higher Education, and my talk was entitled "How Computer Science Programs Are Failing Corporations". My opening statement was "I was told when I started hiring junior programmers I would be choosing from the best of the best. The reality is, I choose the one who sucks the least." I went on to detail specific problems I face in hiring, and what I need the students to know.

> However, if we were to all generate "most optimal code" (meaning most
> efficient code) all the time, then we probably would not get much done,
> and probably we should be ignoring the operating system and writing to
> the bare metal.

Our router folks do this very thing. That being said, I don't feel it is the right choice for all cases. I'll take a moment to pick on Java developers here too, just so no one feels left out. :) Frequently I see Java code where the developer has not taken the time to do proper dependency checking. As such it's not uncommon to find three or four different versions of the same library included in a project. Or so many layers of abstraction are introduced that the amount of time spent in function call resolution is measurable. To be sure, this issue exists for other languages as well. But it is a sign of the programmer wanting to get the project "over and done with" rather than writing "the best code possible". These are choices made by both the programmer and the company. I have worked with shops where idea, design, implementation, test, production, and A/B test all happen in as few as four hours. Lots of the stuff they produce gets thrown away; on the plus side, not much time was spent developing it. But if something sticks and becomes a long-term product offering, then it frequently needs to be refactored.

>> (For the record: Most of us who write in bash, perl, php, python, ruby
>> etc are *scripters* and not programmers.)
> 
> I consider "scripters" to be "programmers", and particularly if it is a
> one-off job.  The shell programs in Unix have been fairly optimized over
> the years to give good performance, and the programmer who recognizes
> the trade-offs between writing, testing and debugging new "C" code to do
> a task versus writing a simple script that does the same thing (to me)
> is the sign of a great programmer, not just a "good" one.

I could be wrong, but to me it seems as if you are mixing two concepts here: design decisions and programming. If I am instructed to write a program in C to remove a file from the file system, then I will write that program in C. That's my job. Now I'll also say "Hey, there is an rm/del command which will do that already." This has less to do with writing optimal code and more to do with design/architecture choices. Now if I'm told to write it in C, I can determine the file system type, request the inode (or equivalent data structure), seek to the file start offset and flip the associated bit(s) to mark the space as being free. Or I could simply call the unlink() function provided to me. Which one is better? I would guess probably unlink(), but I would need to test both with files of about the size needed to be removed. To me, that is a programming decision, and it requires me to have knowledge of how file systems work to know what options are available. Though it seems to me you are largely advocating that such knowledge shouldn't be needed, only that knowing to use unlink() is fine. Again, if I have misunderstood what you were trying to convey, please do edify me.
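
For reference, the unlink() route might look something like the sketch below. It assumes a POSIX system, and the helper name remove_file is made up for illustration. It only covers the "simply call unlink()" option; the inode-level approach depends on the file system and is not attempted here.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * remove_file: hypothetical helper (the name is an assumption, not from
 * any library). Removes the directory entry for `path` using unlink(2).
 * Returns 0 on success, -1 on failure, after reporting the error.
 */
int remove_file(const char *path)
{
    if (unlink(path) != 0) {
        /* unlink() sets errno, e.g. ENOENT if the path does not exist */
        fprintf(stderr, "unlink(%s): %s\n", path, strerror(errno));
        return -1;
    }
    return 0;
}
```

Note that unlink() only removes the name; the kernel frees the underlying blocks once the last link and the last open file descriptor are gone. That is part of why a fair comparison against a hand-rolled approach would need measuring rather than guessing.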

Finally, not to lose sight of the original request. I had intended to indicate that my recommendations were skewed from a perspective of performance computing. Does symlink resolution performance matter on a laptop? No, not at all. I was pointing out that the solution provided goes against best practice for the software mentioned, and that *my* choice of configuration is geared to what I use every day. Overkill for a laptop? Sure, but it keeps my life simple in that files related to X are located in Y, no matter which platform I'm working on.

.r

