trendline gaussian

  • Re: trendline gaussian


    Dave,


    :eureka:


    A light goes on. The key thing about curves that are used for probability distributions is that the integral of the function, that gives the area under the graph, totals 1. Now, if we knew how much oil there was in a given reserve we could then fit a curve to the area as well as the production points.


    I have amended the worksheet to include all of the oil production data. It shows that it is difficult to get a Gamma dist to fit, but the normal curve is a very good fit!


    I am not sure what you did to extend the references. The easiest way is to insert a 100 rows between, say, the 20th and 21st row of data and then paste the new values in and delete the extra rows. Then drag the formulae down. This way you do not need to amend any formulae that are dependent on the new data and the graph series will automatically extend.


    Do you have lots of data sets that you wish to 'fit' the curve for? How many? If you do have a lot then we need to find a better way of doing it, however if you have say 10-20 then this method will work. One that that we may want to do is to 'window' the data and only select data for certain years.


    HTH,


    Alan.

  • Re: trendline gaussian


    Alan


    That's a neat trick with inserting rows - I'm learning all the time.


    "if we knew how much oil there was in a given reserve we could then ..."
    Ah - if only.
    The production data from OECD countries are pretty good, because those governments see it as important to keep the markets well informed. But Russian Federation and OPEC sometimes bend the truth for political/strategic reasons. With estimate of reserves, it seems everyone cheats to the limit of credibility and beyond. The US Geological Survey is wildly optimistic but is quoted a lot. (If they were to tell the truth, the markets would panic.) US-EIA make up supplies to match whatever demand sounds good, no matter how incredible that makes the production rates. OECD-IEA says things like "unless we invest US$17 trillion over the next 25 years, there could be a problem" - that's 1.9 billion a day ! The big oil companies wouldn't tell the truth about what's happening in Tajikistan for commercial reasons.


    For the sceptics, Association for the Study of Peak Oil (ASPO), http://www.peakoil.net have former industry geologists and investment bankers that come up with estimates that are quite different.
    One of the ways the industry denounces ASPO's pessimistic figures is to say "look - the production rates don't fit gaussian curves" and they are sort of right apart from the US data set.
    I was trying to get some practical measure of that.
    But then does that really weaken ASPO's case, as the crunch factor is probably the estimated Ultimately Recoverable Reserves.
    BP's Statistical Review of World Energy (2005) is at http:// www.bp.com/genericsection.do?categoryId=92&contentId=7005893 and covers ~85 countries and groupings from 1965. This is not their private commercial data but internationally published figures.


    You might like to go through a few countries and see what you think is the best way to analyse them. OPEC/Arab countries in particular cut production for political reasons during the 'Oil Shocks' so they are nothing like gaussians.


    World cumulative production is 1 trillion barrels, and estimated URR varies from 2 trillion barrels (ASPO) to 3+ trillion (USGS), so if ASPO is right and if production is gaussian (symmetrical about the peak), then we are at peak now or over the next 2 years. On the downslope of production, recession and oil wars are inevitable.


    Dave

  • Re: trendline gaussian


    Dave,


    Quote from Dave.Kimble@Liz

    "if we knew how much oil there was in a given reserve we could then ..."
    Ah - if only.


    OK, we may not know how much oil there is in a given reserve, or more accurately, how much recoverable oil there is but do we have any data of expected recovery and actual recovery for fiields that are closed or closing? Then you may be able to start looking at what the likely recoverable oil is from the changing estimates of recoverable oil over time vs final recoverable oil. I appreciate that this would be sketchy, but it might improve the estimate of total recoverable oil.


    Your comments on how every one fails to report the reserves accurately is quite worrying and one I am becoming more aware of.


    Quote from Dave.Kimble@Liz

    One of the ways the industry denounces ASPO's pessimistic figures is to say "look - the production rates don't fit gaussian curves" and they are sort of right apart from the US data set.


    As I have said before, the use of these curves must be justified, and a good method for that is to make the curve not just fit the production points but the estimated total reserves. I suspect then that it will not be a Gaussian but some other form, such as Gamma.


    Quote from Dave.Kimble@Liz

    You might like to go through a few countries and see what you think is the best way to analyse them. OPEC/Arab countries in particular cut production for political reasons during the 'Oil Shocks' so they are nothing like gaussians


    Would love to do this sort of analysis, but not sure my boss would approve!


    Quote from Dave.Kimble@Liz

    ...so if ASPO is right and if production is gaussian (symmetrical about the peak)...


    This is perhaps the biggest sticking point and reason I would want to reject the Gauusian approach. It may have been a good first estimate, but surely logic says that there will be a long tail into teh future. Remember, it is easy to repeat analysis to a given methodology in an attept to prove or disprove something, but the real innovation is to prove or disprove something with a more accurate theorey/methodology. You could be the one who changes the way the industry does this type of analysis!


    As the cost of recovery of oil goes up and the energy used to recover goes up then oil will only be recovered for non-fuel reasons (such as production of other chemicals and plastics). Therefore the rate of production will decrease for a different reason than now. This assumes that an alternative form of energy is used to power our transport system and, to a lesser extent, electricity generation.


    Quote from Dave.Kimble@Liz

    ...On the downslope of production, recession and oil wars are inevitable.


    A worrying prospect, but perhaps this is the time to invest in alternative energy and technologies....


    Dave

  • Re: trendline gaussian


    Alan


    "[Gaussian] may have been a good first estimate, but surely logic says that there will be a long tail into the future."
    I think there is a good likelihood of it being a short future tail, that is a right-skewed distribution.
    I don't want to sound too certain about this.


    You have acknowledged that not all the oil will be "produced" ( = extracted) as some will be so hard to pump, and/or of such poor quality, that it takes more than barrel of oil's worth of energy to get a barrel of oil out. Some potential wells will never be produced at all because of excessive depth, lack of basic infrastructure, under water, under ice, etc. These factors vary from well to well so it is very hard to say, but it might be as much as half the oil never gets produced. However we have already discounted that, by talking about Ultimately Recoverable Reserves (URR).


    Energy Return on Energy Invested (EROEI) = Energy Output / Energy Input
    so when EROEI < 1 , production ceases. This is not altered by rising oil prices, since it is strictly energy accounting. The industry naturally chooses to exploit the best EROEI fields first, when it has a choice, and they keep very quiet about the EROEIs of their undeveloped fields. Nevertheless, ex-industry analysts say that early EROEIs were around 100, and that we are now down to starting new fields with EROEIs of around 5. Knowing the EROEI of the remaining fields would effectively define URR and solve the question, but of course we don't know these numbers. Yes, new technology effectively increases EROEI and makes unenergetic fields become energetic, but not by all that much (economists would disagree with this, but they will be shown to be wrong when peak happens, as it has in US, UK, Australia, etc).


    So the profile of EROEIs across the world's oilfields is very important. We would expect that there will be a small number of very energetic wells, more at lower EROEI, more still at poor EROEIs, and half at EROEI < 1. To me that hints at an exp(-kt) function.
    Also, the aim of the oil companies is to produce at a rate of supply equal to demand, to give a stable price environment, and demand has increased about 2% per annum for most of last century, which hints at power function (1 + r)^t
    So net energy, which is what we get in oil after discounting what we spent in getting it, is perhaps the product of (EROEI - 1) and Demand.
    This is all really sloppy stuff, but the attached model visualises this.
    Doesn't the Gamma Distribution have the product of a power and an exponential built into it ? Maybe we are onto something, but I can feel you shaking your head, so I will stop there.


    Dave

  • Re: trendline gaussian


    Dave,


    I was not shaking my head- just starting to think I would love to get my head into this type of analysis- however Imust return to looking at how to get a correlation from apparently uncorrelated data. Always a diffiult task when a client (and most people) would assume a 'relationship' but the data says otherwise. In this case I suspect it is the lack of accurate data at sufficient detail.


    I would love to keep talking, but fear I must stop.


    Do you have a solution to your original question to the forum? If so I think we should stop the thread (out of courtesy to the board). If you do not have an answer then I am happy to help.


    Cheers and good luck with the 'dooms day message'!


    A.

  • Re: trendline gaussian


    sorry but may i ask how come ur values for column E is 2.718/....? where does 2.718 come from?


    i found a stats bk and the formula for Gauss Distribution is f(x) = 1/2.506.... so how did 2.718 come abt?

  • Re: trendline gaussian


    2.71 is the special number e.


    (2*PI())^(1/2) = 2.5066. This should come in the denominator. Beware, the standard deviation sigma was forgotten in the first post(s) in this thread.


    Wigi

    Regards,


    Wigi


    Excel MVP 2011-2014


    For more Excel memes: visit http://www.wimgielis.com ==> English articles ==> Excel memes


    -- Topics without [CODE] tags or a meaningful topic title, will be neglected by me (and probably many others as well) --

  • Re: trendline gaussian


    If your stats book does not mention an e in the form for the normal curve, go to the closest bookstore and buy a decent one.


    Wigi

    Regards,


    Wigi


    Excel MVP 2011-2014


    For more Excel memes: visit http://www.wimgielis.com ==> English articles ==> Excel memes


    -- Topics without [CODE] tags or a meaningful topic title, will be neglected by me (and probably many others as well) --

  • Re: trendline gaussian


    we are talking about Normal(or Gauss) distribution here rite?


    what i found off the net is that the formula is


    f(x) = 1/(sqrt 2*pi) exp [-0.5 ((x-mean)/SD)^2]

  • Re: trendline gaussian


    Alan


    Yes, I have a good answer to my question now, lets close.
    Thanks for your interest and patience, I've learned heaps.


    Benj
    e is the base of natural logarithms and a fundamental constant
    y = e^x is the exponential function exp(x)
    dy/dx = e^x
    e = 1 + x + x²/2! + ... x^n/n! ...
    That e


    Dave

  • Re: trendline gaussian


    3 questions...


    1) how is the mean, sd and scaling found? by right, I thot it shoud be the mean will be average of B2 to B57. but the result is totally different.


    2) what is the purpose of the column E and F? I cant arrive at a conclusion.


    3) is i need to use it for other distributions like gamma, weibull etc, can i simply change the formula on column E to that of the respective distributions??


    pls advise...

  • Re: trendline gaussian


    Benj,


    Give us some help here, this is a long post with at least three different Excel files uploaded. Which file are you looking at? What are you actually trying to do?


    Your previous questions suggest that you are using teh Normal distribution. Have you looked at the Excel help for NormDist and NormInv (and then GammaDist and GammaInv)?


    Also be aware that these solutions use the solver within Excel. The solver has limited capability with these functions and there is a need to 'help' teh solver by providing a good starting point.


    How can we help you?


    A.

  • Re: trendline gaussian


    so sorry.. i am refering to the file gaussian-fit.xls...


    for each set of my data, i nd to do it like the excel file named above.. 1 line to depict my input data and the other to show the gaussian curve in this case of the file.


    I have looked at NormDist... but pardon me, i'm completely new to this area, does the results generated by NormDist correspond to the results generated by the gaussian-fit.xls column E(which uses gaussian instead) so wld i be right to say that if i wan the curve generated by Col E to be of the nature of gamma distribution, i simply change the formula of column E to GammaDist and the corresponding inputs?

  • Re: trendline gaussian


    Benj,


    I can not comment much more on taht file. I suggest that you have a look at the file that I posted (Oil.zip) that has been accepted by Dave as a solution to his problem.


    The Normal curve is the same as the Gaussian distribution.


    To determine the mean and stdev you need to first make sure that you are working on the correct data. Can you post a sample of your data and a description of exactly what it is you need to do?


    A.

  • Re: trendline gaussian


    hi i have looked at oil.zip and have a few queries to clarify.


    1) how is alpha, beta, scaling and x-shift obtained?
    2) y must the values from Col L41 onwards be multiplied by M6?
    3) what is the purpose of col M of residual?

  • Re: trendline gaussian


    Benj,


    Quote from benj

    hi i have looked at oil.zip and have a few queries to clarify.
    1) how is alpha, beta, scaling and x-shift obtained?
    2) y must the values from Col L41 onwards be multiplied by M6?
    3) what is the purpose of col M of residual?


    OK, let's take a step back. What do you know about the Gamma function? These questions suggest to me that you have not plotted one before and you are not familiar with transforming data to fit distributions. Also, question 3 suggests that you are not familiar with regression.


    Can you please tell me what you are actually trying to do. There is a danger that you are in a rabbit warren here and I do not want us to get lost.


    What data do you have and what are you trying to do?
    Why Gamma?


    A. :)

  • Re: trendline gaussian


    actually you are right, i'm completely new to this field and clueless about the terminologies...


    i read from Excel Help that GammaDist requires 4 inputs, other than that, i cant really understand the explainations i got from the stats books that i got.


    i have not plotted a gamma function before nor am i familiar wif transforming data to fit distributions or regression.


    What I nd to do here is... i have a set of input values (1000 rows of data) however, I nd to visually see which distribution fits best to the input values that i have. as such i will nd to plot the the curve that holds my input values and ALSO the curve that depicts a standard gamma distribution in this case. *in other cases, i wil need to consider distributions like weibull and beta* the reason for choosing these 3 distributions is the requirements set out by my assignment.


    not sure if i have provided enough info.

  • Re: trendline gaussian


    I had a feeling that this was homework....it is useful if you state this upfront. Not because we will not help you, but because it changes how we explain things and what we will get you to do.


    OK. Have a look at this site and it will provide a description of the gamma distribution.


    You can see that the parameters change the curev significantly.


    I have to go now, but will have anotehr read in the morning.


    A.

Participate now!

Don’t have an account yet? Register yourself now and be a part of our community!